(57) Abstract: Many Gram-negative pathogens
assemble adhesive structures on their surfaces that 
allow them to colonize host tissues and cause disease. 
Novel compositions for the prevention or inhibition 
of pilus assembly in Gram-negative pathogens are 
disclosed. Interacting with the binding site of pili 
subunits will negatively affect the chaperone/usher 
pathway which is one molecular mechanism by 
which Gram-negative bacteria assemble adhesive pili 
structures and thus prevent or inhibit pilus assembly. 
Additionally, novel compounds and compositions for
interfering with or preventing adhesion of piliated bacteria
to host tissues are provided. Such compounds and
compositions prevent or inhibit pili adhesion to host 
tissues by interacting with the mannose-binding 
domains on pilus adhesin subunits. Also provided are 
methods for the treatment or prevention of diseases 
caused by tissue-adhering pilus-forming bacteria by 
interaction with the binding between pilus subunits;
the binding between pilus subunits and periplasmic
chaperones; and the binding of a pilus adhesin to
the host epithelium. Also provided are pharmaceutical
preparations capable of interacting with the binding between
pilus subunits and periplasmic chaperones and between the
pilus adhesin and the host epithelium. The present invention
further relates to crystals of chaperone-subunit co-complexes,
detailed three-dimensional structural information illustrating
the interaction between a pilus subunit and a chaperone for a
pilus chaperone-subunit co-complex, and methods that use this
information to design and screen for compounds that exhibit
antibacterial activity. The invention further relates to
machine-readable media embedded with the three-dimensional
atomic structure coordinates of a pilus chaperone-subunit
co-complex and subsets thereof.

ANTI-BACTERIAL COMPOUNDS DIRECTED AGAINST PILUS
BIOGENESIS, ADHESION AND ACTIVITY; CO-CRYSTALS OF 
PILUS SUBUNITS AND METHODS OF USE THEREOF 

This invention was made in part with Government support under National Institutes of
Health Grants R01DK51406, R01AI29549 and R01GM54033. The Government has certain
rights in the invention. 

This application claims priority to co-pending United States provisional patent 
application Ser. No. 60/148,280, filed August 11, 1999, incorporated herein by reference.

Field of the Invention 

The present invention relates to compounds and methods for the treatment of diseases 
caused by tissue-adhering pilus-forming bacteria. More specifically, the invention relates to 
pharmaceutical preparations comprising substances capable of interfering with the binding of
periplasmic chaperones to pilus subunits as well as pharmaceutical compounds capable of
interfering with the binding between pilus subunits.

The present invention further relates to crystalline forms of pilus-subunit co- 
complexes, the high-resolution X-ray diffraction structures and atomic structure coordinates 
obtained therefrom. The pilus subunit co-crystals of the invention and the atomic structural 
information obtained therefrom are useful for solving structures of related proteins, and for 
screening for, identifying and/or designing compounds that bind periplasmic chaperones or 
pilus subunits and thus prevent the assembly and/or biological function of pili. 

Background of the Invention 

Many pathogenic Gram-negative bacteria such as Escherichia coli, Haemophilus
influenzae, Salmonella enteritidis, Salmonella typhimurium, Bordetella pertussis, Yersinia
enterocolitica, Yersinia pestis, Helicobacter pylori and Klebsiella pneumoniae assemble
hair-like adhesive organelles called pili on their surfaces. Pili are thought to mediate
microbial attachment, often the essential first step in the development of disease, by binding
to receptors present in host tissues and may also participate in bacterial-bacterial interactions
important in biofilm formation.

Uropathogenic strains of E. coli express P and type 1 pili that bind to receptors present
in uroepithelial cells. Adhesive P pili are virulence determinants associated with
pyelonephritic strains of E. coli whereas type 1 pili appear to be more common in E. coli causing
cystitis. The adhesin present at the tip of the pilus, PapG, binds to the Galα(1-4)Gal moiety
present in the glycolipids and glycoproteins, while the type 1 adhesin, FimH, binds
D-mannose present in glycolipids and glycoproteins.

Type 1 pili are adhesive fibers expressed in E. coli as well as in most of the
Enterobacteriaceae family. The type 1 pilus is a right-handed helix with about 3 subunits per
turn, a diameter of approximately 70 Å, a central pore of about 20-25 Å, and a rise per
subunit of about 8 Å. See G.E. Soto et al., EMBO J., 17: 6155 (1998). Type 1 pili are
composite structures in which a short tip fibrillar structure containing FimG and the FimH
adhesin (and possibly the minor component FimF as well) is joined to a rod comprised
predominantly of FimA subunits. See Jones et al., Proc. Natl. Acad. Sci. U.S.A., 92: 2081
(1995). The FimH adhesin mediates binding to mannose-containing receptors. See
Abraham et al., Nature, 336: 682 (1988); K.A. Krogfelt et al., Infect. Immun., 58: 1995
(1990). In uropathogenic E. coli, this binding event has been shown to play a critical role in
bladder colonization and disease.
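
The helical parameters quoted above for the type 1 pilus imply a pitch and a rotation per subunit. The following is a minimal sketch only, assuming an ideal helix with exactly 3 subunits per turn and a rise of exactly 8 Å (the text gives both values as approximate); the function name is illustrative and not part of the specification.

```python
def helix_geometry(subunits_per_turn: float, rise_per_subunit: float) -> tuple[float, float]:
    """Return (pitch, twist_deg) for an ideal helix.

    pitch: axial distance covered by one full turn (same unit as the rise).
    twist_deg: rotation about the helix axis between successive subunits.
    """
    pitch = subunits_per_turn * rise_per_subunit
    twist_deg = 360.0 / subunits_per_turn
    return pitch, twist_deg


# Nominal values from the text: about 3 subunits per turn, rise of about 8 angstroms
pitch, twist = helix_geometry(3, 8.0)
print(f"pitch ~ {pitch:.0f} A per turn, twist ~ {twist:.0f} degrees per subunit")
```

Under these assumptions one turn of the rod advances roughly 24 Å along the axis, with neighbouring subunits rotated by roughly 120 degrees.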

Type 1 pilus biogenesis proceeds by way of a highly conserved chaperone/usher
pathway that is involved in the assembly of over 25 adhesive organelles in Gram-negative
bacteria. See G.E. Soto and S. Hultgren, J. Bacteriol., 181: 1059 (1999). The usher forms an
oligomeric channel in the outer membrane with a pore size of approximately 2.5 nm and
mediates subunit translocation across the outer membrane. See D.G. Thanassi et al., Proc.
Natl. Acad. Sci. USA, 95: 3146 (1998).

The P pilus is a heteropolymeric surface fiber with an adhesive tip and consists of two major
sub-assemblies, the pilus rod and the tip fibrillum. The pilus rod is a thick rigid rod made up
of repeating PapA subunits arranged in a right-handed helical cylinder whereas the tip
fibrillum is a thin, flexible tip fiber extending from the distal end of the pilus rod and is
composed primarily of repeating PapE subunits arranged in an open helical configuration.
Two components of the tip fibrillum, PapK and PapF, act as adaptors. PapK is thought to
link the pilus rod to the base of the tip fibrillum and to regulate the length of the tip fibrillum:
its incorporation terminates growth of the tip fibrillum and nucleates the formation of the
pilus rod. PapF is thought to join the PapG adhesin to the distal end of the flexible tip fibrillum.

The biogenesis of P pili also occurs via the highly conserved chaperone/usher
pathway. See D.G. Thanassi et al., Curr. Opin. Microbiol., 1: 223 (1998); D.L. Hung et al.,
EMBO J., 15: 3792 (1996). P pili are adhesive organelles encoded by eleven genes in the pap
(pilus associated with pyelonephritis) gene cluster found on the chromosome of
uropathogenic strains of E. coli. Six genes encode structural pilus subunits, PapA, PapH,
PapK, PapE, PapF and PapG. See S.J. Hultgren et al., Cell, 73: 887 (1993).

In P pili, two of the genes in the pap operon, papD and papC, encode the chaperone
and usher, respectively. Chaperones such as PapD in E. coli are required to bind to pilus
proteins imported into the periplasmic space, partition them into assembly-competent
complexes and prevent non-productive aggregation of the subunits in the periplasm. See
M.J. Kuehn et al., Proc. Natl. Acad. Sci. USA, 88: 10586 (1991). PapD is a periplasmic
chaperone that mediates the assembly of P pili. Detailed structural analysis has revealed that
the PapD chaperone is the prototype member of a conserved family of periplasmic
chaperones in Gram-negative bacteria. Periplasmic chaperones consist of two
immunoglobulin-like domains with a deep cleft between the two domains. See A. Holmgren
and C.I. Branden, Nature, 342: 248 (1989); M. Pellecchia et al., Nature Struct. Biol., 5: 885
(1998). Further, all members of the periplasmic chaperone superfamily have a conserved
hydrophobic core that maintains the overall features of the two domains.

Periplasmic chaperones, along with outer membrane ushers, constitute a molecular
mechanism necessary for guiding biogenesis of adhesive organelles in Gram-negative
bacteria. These chaperones function to cap and partition interactive subunits imported into
the periplasmic space into assembly-competent co-complexes, making non-productive
interactions unfavorable. The chaperone-subunit co-complexes are targeted to the outer
membrane usher, where the subunits assemble in a specific order to form a pilus.
During pilus biogenesis, PapD binds to and caps interactive surfaces on pilus subunits and
prevents their premature aggregation in the periplasm. PapD binds to each of the pilus
subunit types as they emerge from the cytoplasmic membrane and escorts them in
assembly-competent, native-like conformations from the cytoplasmic membrane to outer membrane
assembly sites comprised of PapC. PapC has been termed a molecular usher since it receives
chaperone-subunit co-complexes and incorporates, or ushers, the subunits from the chaperone
co-complex into the growing pilus in a defined order.

In the absence of an interaction with the chaperone, pilus subunits aggregate and are
proteolytically degraded. Kolmer et al. and Jones et al. have shown that the DegP protease
degrades pilus subunits in the absence of the chaperone. See J. Bacteriol., 178: 5925 (1996);
EMBO J., 16: 6394 (1997). This discovery led to the elucidation of the fate of pilus subunits
expressed in the presence or absence of the chaperone using monospecific antisera in Western
blots of cytoplasmic membrane, outer membrane and periplasmic proteins prepared according to
methods known in the art.

Thus, prevention or inhibition of normal pilus assembly in a Gram-negative bacterium
impacts the pathogenicity of the bacterium by preventing the bacterium from attaching to and
infecting host tissues. Moreover, changes in the binding between pilus subunits and
chaperones can have a dramatic impact on the efficiency of pilus assembly, and thus on the
ability of a Gram-negative bacterium to adhere to and, consequently, infect host tissues.
Prevention and inhibition of binding between pilus subunits and between pilus subunits and
periplasmic chaperones have the effect of impairing pilus assembly, whereby the infectivity
of the Gram-negative bacterium expressing the pili is reduced. Accordingly, a need exists, in
general, for compositions and methods for preventing or inhibiting the normal interaction
between pilus subunits and/or between a pilus subunit and a chaperone.

However, identification of such compositions has heretofore relied on serendipity
and/or systematic screening of large numbers of natural and synthetic compounds. A far
superior method is structure-based drug design, in which the three-dimensional
structures of proteins or protein fragments are determined and potential agonists
and/or potential antagonists are designed with the aid of computer modeling. However,
heretofore the three-dimensional structure illustrating the interaction between pilus subunits
and/or between a pilus subunit and a chaperone has remained unknown, essentially because
no such protein co-crystals had been produced which would permit the required X-ray
crystallographic data to be obtained.

Therefore, there is presently a need for obtaining a co-crystal of a co-complex of a
pilus and a chaperone to allow such crystallographic data to be obtained. Furthermore, there
is a need for the determination of the three-dimensional structure of such co-crystals. Finally,
there is a need for procedures for related structure-based drug design based on such
crystallographic data.



Summary of the Invention

Accordingly, the present invention provides antibacterial compositions and 
compounds capable of inhibiting or preventing pilus assembly in a Gram-negative bacterium. 
Such compounds interfere with the function of chaperones required for the assembly of pili 
from pilus subunits in diverse Gram-negative bacteria. Another object of the invention is to 
provide compounds having antibacterial activity that prevent or inhibit pili assembly by 
interfering with the interactions between pilus subunits. Yet another object of the invention is 
to provide compounds capable of inhibiting or preventing the function of pili adhesion to host 
epithelium, thereby reducing the capacity of bacteria to attach to and infect host tissues. It is
a further object of the invention to provide antibacterial compounds which have broad
specificity for a diverse group of Gram-negative bacteria. Other objects include the provision
of methods of preventing and inhibiting pilus assembly, methods of preventing or inhibiting
pili adhesion to host tissues, methods of treating bacterial infections, methods for preventing
and inhibiting biofilm formation and methods of preventing colonization by various
Gram-negative bacteria.






Another aspect of the invention is to provide crystalline forms of polypeptides 
corresponding to a pilus chaperone-subunit protein co-complex. Thus, further objects of the 
present invention include the provision of the atomic structure coordinates obtained from the 
pilus chaperone-subunit co-crystals and methods of utilizing the three dimensional structural 
information obtained from the co-crystals to design or identify compounds with antibacterial 
activity. Another related object is to provide machine- or computer-readable media 
embedded with the three-dimensional structural information obtained from the pilus 
chaperone-subunit co-complex, or portions or subsets thereof which can be used to identify or 
design antibacterial compounds. A further object is to provide methods of making the co- 
crystals of the invention. 

Therefore, in one aspect, the present invention is directed to isolated and purified
compounds and synthesized compounds which bind to a pilus subunit groove and thus inhibit
pilus assembly. Preferably, such compounds mimic the binding activity of the G1 beta-strand
of a periplasmic chaperone and comprise a polypeptide having an amino acid sequence
containing at least two alternating hydrophobic amino acid residues. In a preferred
embodiment, this polypeptide would be derived from a G1 beta-strand of a periplasmic
chaperone; more preferably, this polypeptide would be comprised of amino acids derived
from the N101 to L107 amino acid region of a G1 beta-strand of a periplasmic chaperone. A
particularly preferred antibacterial compound comprises a peptide comprising an
amino-terminal amino acid sequence Asn-Val-Leu-Gln-Ile-Ala-Leu (SEQ ID NO: 1) or any
related analogues that would competitively bind to the binding site of a pilus subunit.
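
One way to read the "alternating hydrophobic residues" criterion is that hydrophobic side chains occupy every second position of the peptide, as would be expected for a beta-strand that presents one face to a binding groove. The sketch below checks a one-letter sequence for such a pattern. The set of residues treated as hydrophobic is an assumption made for illustration and is not the classification defined in this specification; the function name is likewise hypothetical.

```python
# Assumed hydrophobic set (illustrative only; the specification's own
# hydrophobic/hydrophilic classification may differ).
HYDROPHOBIC = set("AVLIMFWY")


def longest_alternating_hydrophobic(seq: str) -> int:
    """Length of the longest run of hydrophobic residues found at
    positions i, i+2, i+4, ... (every second residue) of a one-letter sequence."""
    best = 0
    for start in range(len(seq)):
        count = 0
        for pos in range(start, len(seq), 2):
            if seq[pos].upper() not in HYDROPHOBIC:
                break
            count += 1
        best = max(best, count)
    return best


# SEQ ID NO: 1 (Asn-Val-Leu-Gln-Ile-Ala-Leu) written in one-letter code
run = longest_alternating_hydrophobic("NVLQIAL")
print(run, "alternating hydrophobic positions; criterion met:", run >= 2)
```

With this assumed residue set, the Leu, Ile and Leu residues of SEQ ID NO: 1 fall on every second position, so the peptide satisfies the "at least two" criterion.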

In another embodiment, such compounds mimic the binding activity of the amino-terminal
end of a pilus subunit and comprise a polypeptide having an amino acid sequence
containing at least two alternating hydrophobic amino acid residues. Such antibacterial
compounds will competitively bind to a binding site on pilus subunits, thereby inhibiting or
preventing pilus assembly. A preferred polypeptide would be derived from the sequences of
conserved amino-terminal motifs of pilus subunits. A particularly preferred antibacterial
compound comprises a peptide comprising an amino-terminal amino acid sequence
Ser-Asp-Val-Ala-Phe-Arg-Gly-Asn-Leu-Leu (SEQ ID NO: 12) or any related analogues that would
competitively bind to the binding site of a pilus subunit.

A further object of the invention is to provide compounds which mimic mannose by 
binding to the amino-terminal end of the FimH adhesin. Such antibacterial compounds will 
bind to the mannose-binding site on pilus adhesins, thereby inhibiting or preventing the 
function of the pili to attach to and infect host tissues. 
Compounds that inhibit both pilus assembly and the attachment of pili to host
tissues are particularly effective since both the formation of pili and attachment of pili to host
tissues are essential to bacterial pathogenicity. As such, the invention further provides
compositions containing the above compounds in conjunction with a pharmaceutically
acceptable carrier, excipient or diluent. Also provided are methods of preventing or
inhibiting pilus assembly in a Gram-negative bacterium by administering an effective amount
of a compound capable of interfering with the binding of pilus subunits and all pilus subunit
homologues. The invention is also directed to methods of preventing or inhibiting the
pathogenicity of a Gram-negative bacterium comprising administering an effective amount of
a compound capable of interfering with the adhesion of pili to host tissues. Further provided
are methods for treating Gram-negative infections which comprise providing to a subject an
effective amount of the above compounds and compositions.

Further, the present invention is directed to methods for preventing or inhibiting
biofilm formation on a surface or in an environment containing Gram-negative bacteria. Also
provided are methods for inhibiting bacterial colonization by a Gram-negative organism. 
These methods are accomplished by administering to such surfaces and environments an 
effective amount of a compound or a composition which is capable of interfering with pilus 
assembly or the ability of the pilus to adhere to and subsequently infect host tissues. 

In another aspect, the invention provides compositions comprising crystalline forms
of polypeptides corresponding to the PapD-PapK chaperone-pilus subunit protein co-complex.
The PapD-PapK co-crystals comprise crystallized polypeptides corresponding to
the wild-type or mutated PapD-PapK co-complexes. The PapD-PapK co-crystals preferably
include native co-crystals, heavy-atom derivative co-crystals and co-crystals of a PapD-PapK
co-complex that is further associated with one or more other molecules or compounds.
Preferably, such other compounds bind to a site involved in protein-protein interactions in the 
pilus. 

The PapD-PapK co-crystals are generally characterized by a space group of P2,2 2
and unit cell dimensions of a = 62.1 ± 0.2 Å, b = 63.6 ± 0.2 Å, c = 92.7 ± 0.2 Å, and are preferably of
diffraction quality. In a preferred embodiment, the PapD-PapK co-crystals are of sufficient
quality to permit the determination of the three-dimensional X-ray diffraction structure of the
crystalline polypeptide co-complex to high resolution, preferably to a resolution of greater
than about 3 Å, typically in the range of about 1 Å to about 3 Å.
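
As a rough illustration of what these cell parameters imply, the volume of the unit cell follows directly from the three edge lengths when all cell angles are 90 degrees. That orthogonality is an assumption here, since the text does not list the cell angles, and the function name is illustrative only.

```python
def orthogonal_cell_volume(a: float, b: float, c: float) -> float:
    """Volume, in cubic angstroms, of a unit cell whose angles are all
    90 degrees, given the edge lengths a, b and c in angstroms."""
    return a * b * c


# Nominal PapD-PapK cell edges from the text (the 0.2 angstrom tolerance is ignored)
volume = orthogonal_cell_volume(62.1, 63.6, 92.7)
print(f"unit cell volume ~ {volume:.0f} cubic angstroms")
```

The same function applies to any cell with right angles, for example one with a = b, by passing the corresponding edge lengths.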

The invention also provides methods of making the co-crystals of the invention.
Generally, co-crystals of the invention are grown by dissolving substantially pure 
polypeptides in an aqueous buffer that includes a precipitant at a concentration just below that 
necessary to precipitate the polypeptide. Water is then removed by controlled evaporation to 
produce precipitating conditions, which are maintained until co-crystal growth ceases. 

In another aspect, the invention provides machine- or computer-readable media
embedded with the three-dimensional structural information obtained from the PapD-PapK
co-crystals of the invention, or obtained from FimC-FimH co-crystals, or portions or subsets
thereof. Such three-dimensional structural information will typically include the atomic
structure coordinates of the crystallized polypeptide co-complex, or the atomic structure
coordinates of a portion thereof, such as, for example, the atomic structure coordinates of one
member of the co-complex or an active or binding site of one or both members, but may
include other structural information, such as vector representations of the atomic structure
coordinates, etc.
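
To make the idea of a coordinate subset concrete, the sketch below reads atom records and keeps only those belonging to one member of a co-complex. It assumes the coordinates are stored in the fixed-column PDB text format, which the text does not specify. The chain identifiers "A" and "B", the helper names and the coordinate values are hypothetical and serve only to make the example self-contained.

```python
def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a minimal PDB-style ATOM record (fixed columns) for the demo."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}"
            f"{res_seq:4d}    {x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atom_line(line):
    """Extract fields from one ATOM/HETATM record using PDB column positions."""
    return {
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


def select_chain(lines, chain_id):
    """Return the parsed atoms of a single chain, i.e. one member of a co-complex."""
    return [parse_atom_line(ln) for ln in lines
            if ln.startswith(("ATOM", "HETATM")) and len(ln) >= 54 and ln[21] == chain_id]


# Made-up records: two atoms in hypothetical chain A, one in hypothetical chain B
demo = [
    make_atom_line(1, "CA", "ASN", "A", 101, 1.0, 2.0, 3.0),
    make_atom_line(2, "CA", "LEU", "A", 107, 4.0, 5.0, 6.0),
    make_atom_line(3, "CA", "SER", "B", 1, 7.0, 8.0, 9.0),
]
subset = select_chain(demo, "A")
print([(atom["res_name"], atom["res_seq"]) for atom in subset])
```

A binding-site subset could be selected the same way by filtering on residue numbers instead of the chain identifier.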

Thus, the atomic structure coordinates and machine-readable media of the invention
have a variety of uses. As such, provided are methods of identifying antibacterial compounds
which utilize the coordinates for solving the three-dimensional X-ray diffraction and/or
solution structures of other proteins, including mutant co-complexes, co-complexes further
associated with other molecules, and unrelated proteins, to high resolution. Structural
information may also be used in a variety of molecular modeling and computer-based
screening applications to, for example, intelligently design mutants of the crystallized PapD-PapK
or FimC-FimH co-complexes having altered biological activity and to computationally
design and identify compounds that bind the polypeptide co-complexes or a portion or
fragment of the polypeptide co-complexes, such as the mannose binding site of FimH and/or
the G1 beta-strand binding cleft of PapK.

In another aspect, the present invention provides methods of using the coordinates of
the PapD-PapK co-complex or of the FimC-FimH co-complex, or subsets of such structure
coordinates, to design or identify candidate compounds capable of binding to a binding site
on one member of the co-complex. Such candidate
compounds may be evaluated for biological activity, such as, for example, the ability to bind
(preferably competitively) to a pilus subunit, the ability to inhibit pilus
subunit assembly and/or the ability to prevent adherence of a Gram-negative bacterium to a
host tissue. In one embodiment, the co-crystals from which the PapD-PapK co-complex
structure is derived have the space group and cell dimensions described above, such that the
three-dimensional structure of the co-complex is provided to a resolution of from about 3.0 Å
to about 2.4 Å or greater. In another embodiment, the co-crystals from which the FimC-FimH
co-complex structure is derived have the space group P4.2.2 or P4, with unit cell
dimensions of a = b = 97.7 ± 0.2 Å and c = 215.9 ± 0.2 Å, such that the three-dimensional
structure of the co-complex can be determined to a resolution of from about 3.0 Å to about
2.5 Å or greater.

In a further aspect of the invention, such potential compounds are evaluated for
biological activity. Candidate antibacterial compounds are designed or identified using the 
atomic structure coordinates of the PapD-PapK or FimC-FimH co-complexes or subsets 
thereof, synthesized and screened for their ability to bind to pilus subunits, thereby inhibiting 
or preventing pilus biogenesis. The antibacterial activity of the compound is determined by 
assaying the bacterium for infectivity or monitoring the pilus for activity. Alternatively,
compounds designed or identified based upon their ability to bind the mannose binding
domain of FimH are synthesized and screened for their ability to bind FimH. Such
compounds that are able to prevent or inhibit pilus biogenesis or the ability of the bacterial 
pilus to attach to a host tissue can be used in the compositions of the present invention. 

Other objects and features will be in part apparent and in part pointed out hereinafter. 

Brief Description of the Figures

Fig. 1A is a depiction of representative regions of the electron density of a PapD G1
beta-strand. Electron density is from a simulated annealing omit map calculated using the
phases derived from the final model where the PapD G1 beta-strand residues 101 to 108 have
been omitted. Strands are labeled.

Fig. 1B is a depiction of representative regions of electron density shown in PapD G1
beta-strand zippering to the PapK F strand. The density is from a map calculated using
unbiased experimental MAD solvent-flattened phases.

Fig. 1C is a view from the hydrophobic core of PapK looking out toward the PapD G1
beta-strand that inserts into the groove of the subunit. Residues throughout are labeled. The
density is from a map calculated using unbiased experimental MAD solvent-flattened phases.

Fig. 2A is a schematic of a stereo ribbon diagram. Subscripts 1 and 2 refer to
domains 1 and 2 of PapD, respectively.

Fig. 2B is a stereo ribbon diagram. The molecular surface of PapK was calculated and
displayed using GRASP. The structure of PapD is shown as a ribbon. The insertion of the G1
beta-strand of PapD into a deep groove on the surface of PapK can be seen.

Fig. 3A is the topology of PapK. Beta-strands are indicated as arrows, while helices
(either α or 3₁₀) are shown as cylinders.



WO 01/10386 

PCT/US00/22087 


Fig. 3B is a depiction of the sequence alignment of P-pilus subunits (PapA, PapK,
PapE and PapF) [...]. Residue numbers of PapK are indicated above the PapK sequence. [...]
that all pilins have structures similar to PapK.

Fig. 3C is a depiction of the secondary structure definition of PapD. Residue numbers [...].

Fig. 4 is a superposition of [...] of apo-PapD and PapD complexed to
PapK. The arrow indicates the conformational change in the F1-G1 loop upon subunit
binding.

Fig. 5 [...]. On the left, PapD is [...]. On the right, PapK is shown as a [...]
model and PapD as a ribbon. The various binding sites [...].

Fig. 6A is a schematic of a stereo contact diagram of interactions between PapD and
[...]. Residues making contacts are shown in stick representation (thin
for PapD, and thick for PapK).

Fig. 6B is a schematic of a stereo contact diagram of interactions between PapD and
[...] and the COOH-terminal
strand F form the sides of the groove in PapK. Residues making contacts are shown in stick
representation (thin for PapD, and thick for PapK).

Fig. 6C [...] domain 2 of PapD. Residues making contacts are shown in stick
representation (thin for PapD, and thick for PapK).

Fig. 6D is a schematic of a stereo contact diagram of interactions between the [...].
Residues making contacts are shown in stick
representation (thin for PapD, and thick for PapK).

[...] PapK. The PapD G1 strand is represented as a stick model with color coding as in Fig. [...]
and PapK is shown as a molecular surface calculated using GRASP. Notice the
predominance of hydrophobic residues in the groove, the base of which is part of the
hydrophobic core of the protein.

Fig. 7A is a schematic diagram of subunit-subunit interactions in the pilus rod model as
viewed from above. Insertion of the NH2-terminal strand of one subunit into the groove made
by the A2 and F strands of the preceding subunit such that the NH2-terminal strand is parallel
to strand F results in a three-pointed-star-shaped cross-section inconsistent with electron
microscopy data. Strands (arrows) are labeled, as are the NH2- and COOH-termini (N and C,
respectively). Hydrogen bonding interactions are shown schematically.

Fig. 7B is a schematic diagram of subunit-subunit interactions in the pilus model as
viewed from above. Insertion of the NH2-terminal strand antiparallel to strand F yields a
cross-section compatible with electron microscopy data. Strands (arrows) are labeled, as are
the NH2- and COOH-termini (N and C, respectively). Hydrogen bonding interactions are
shown schematically.

Fig. 7C is a molecular surface of a pilus rod (program GRASP). The disordered
residues at the NH2-terminus of the subunit were modeled as a strand that inserts into the
groove of the preceding subunit. Approximately [...] turns of the model pilus, whose
dimensions are similar to the known values from electron microscopy, are shown.

Fig. 7D is a stereo ribbon diagram of the rod model. The insertion of the NH2-terminal
strand of one subunit into the groove of the preceding subunit can be clearly seen.

Fig. 8A depicts the amino acid sequences of type 1 pilus subunits (FimA, FimF, FimG,
FimH). The end of the mannose binding lectin domain and the start of the pilin domain in
FimH are indicated by vertical arrows above the sequences. Type 1 pilin subunits (FimA,
FimF, FimG) were aligned with the pilin domain of FimH using Clustal W and manually
adjusted to minimize gaps in secondary structure elements. Gaps in the alignment are
indicated by dots. Sequence numbering for FimH starts at position 22 in the pre-protein.
Residues involved in chaperone binding are indicated by an open circle above the residue.
Residues in the carbohydrate binding pocket are boxed. A large box marks the NH2-terminal
extensions in the pilin subunits. The conserved beta-zipper motif found in all pilin subunits
corresponds to the F beta-strand. Limits and nomenclature for secondary structure elements
are shown below the sequence.

Fig. 8B shows beta-sheet topology diagrams of the mannose binding domain (left) and
pilin domain (right) of FimH.

Fig. 9A is a typical sample of the solvent-flattened experimental electron density map
(contoured at 1.0σ) with the refined model superimposed. Arg8C and Lys112C anchor the
COOH-terminus of FimH in the subunit binding cleft of the chaperone via hydrogen bonds to
the terminal carboxylate.

Fig. 9B is a MOLSCRIPT ribbon diagram of the FimC-FimH co-complex. A ball-and-stick
representation of the C-HEGA molecule bound to the lectin domain of FimH
indicates the position of the carbohydrate-binding site at the tip of the domain.

Fig. 10A is a depiction of FimH carbohydrate binding. A stereo view of the
carbohydrate binding pocket with a molecule of C-HEGA bound. Residues Phe1H, Ile13H
[...] and the
surface of the pocket at the tip of the lectin domain are shown. Residues that take part in
hydrogen bonding to the glucamide moiety of C-HEGA are labeled.

Fig. 10B is a depiction of the surface of the FimH pilin domain showing the exposed
hydrophobic core. Hydrophobic residues that are in contact with FimC in the co-complex but
solvent exposed upon removal of the chaperone are highlighted in yellow. Right: as left but
with FimC ribbon in blue. The seventh (G1) strand of FimC can be seen to
complement the incomplete hydrophobic core of the pilin domain.

Fig. 10C is a close-up of donor strand complementation interactions. Hydrophobic
residues on the surface of the pilin domain (Val, Ala165H, Thr169H, Ile, Leu183H, Val,
Leu, Ile, Val and Phe) and FimC residues involved in donor strand
complementation (Leu, Leu, Ile, Ser [...]) [...]
complete hydrophobic core extending between the two proteins.
Fig. 11A is a model of the type 1 pilus. 

Fig. 11B is a top view of the type 1 pilus. Residue positions that are subject to allelic
variation map to the outer surface of the pilus.

Fig. 11C is a side view of the type 1 pilus.

Fig. 12 is a graph representing the binding of FimH to polypeptides corresponding
to the G1 beta-strand of FimC and the N-terminal extension of FimC. The two polypeptides
or FimC were coated onto microtiter wells and FimH binding to the immobilized
polypeptides or FimC protein was determined by ELISA using anti-FimH antibodies. The
graph represents the average of triplicate wells with the standard deviation shown in bars.

Fig. 13 is a graph which represents the binding of FimH in the presence of increasing 
concentrations of the FimC polypeptide. It can be seen that FimC polypeptides inhibit FimH 
binding to FimC. The graphs represent the average of triplicate wells with the standard 
deviation shown in bars. 

Fig. 14 is a graph which represents the FimH binding to FimC in the presence or 
absence of FimG or FimC polypeptides as monitored by ELISA. The graphs represent the 
average of triplicate wells with the standard deviation shown in bars. 



Abbreviations anri npfiniri^. 

To facilitate understanding of the invention, a number of terms are defined below: 
The amino acid notations used herein for the twenty genetically encoded L-amino
acids are conventional and are abbreviated as follows:

Amino Acid        One-Letter Symbol    Three-Letter Symbol

Alanine           A                    Ala
Arginine          R                    Arg
Asparagine        N                    Asn
Aspartic acid     D                    Asp
Cysteine          C                    Cys
Glutamine         Q                    Gln
Glutamic acid     E                    Glu
Glycine           G                    Gly
Histidine         H                    His
Isoleucine        I                    Ile
Leucine           L                    Leu
Lysine            K                    Lys
Methionine        M                    Met
Phenylalanine     F                    Phe
Proline           P                    Pro
Serine            S                    Ser
Threonine         T                    Thr
Tryptophan        W                    Trp
Tyrosine          Y                    Tyr
Valine            V                    Val

As used herein, unless specifically delineated otherwise, the three-letter and one-letter
amino acid abbreviations designate amino acids in either the D-configuration or the
L-configuration. For example, Arg designates D-arginine and L-arginine, and R designates
D-arginine and L-arginine.

Unless noted otherwise, when polypeptide sequences are presented as a series of one-letter
and/or three-letter abbreviations, the sequences are presented in the N → C direction, in
accordance with common practice. As used herein, "Cα" refers to the alpha carbon of an
amino acid residue.
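
As an illustration of the notation conventions above, the following sketch converts between one-letter and three-letter representations of a sequence written in the N → C direction. The helper names and the example peptide are hypothetical and are not taken from the specification; the hyphen used to join three-letter codes is likewise an assumption about presentation.

```python
# Conventional abbreviations for the twenty genetically encoded amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_three_letter(seq):
    """Convert a one-letter sequence (N to C) to hyphen-joined three-letter codes."""
    return "-".join(ONE_TO_THREE[residue] for residue in seq.upper())


def to_one_letter(seq):
    """Convert hyphen-joined three-letter codes (N to C) to a one-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in seq.split("-"))


example = "LLIS"  # hypothetical peptide, written N to C
print(to_three_letter(example))
```

Because the abbreviations do not distinguish the D- and L-configurations, any such mapping would need an additional annotation if stereochemistry matters.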

For purposes of determining conservative amino acid substitutions in the various
polypeptides described herein and for describing the various peptide and peptide analog
compounds, the amino acids can be conveniently classified into two main categories,
hydrophilic and hydrophobic, depending primarily on the physical-chemical characteristics
of the amino acid side chain. These two main categories can be further classified into
subcategories that more distinctly define the characteristics of the amino acid side chains.
For example, the class of hydrophilic amino acids can be further subdivided into acidic, basic
and polar amino acids. The class of hydrophobic amino acids can be further subdivided into
apolar and aromatic amino acids. The definitions of the various categories of amino acids are
as follows:

"Hydrophilic amino acid" refers to an amino acid exhibiting a hydrophobicity of less 
than zero according to the normalized consensus hydrophobicity scale of Eisenberg et al.,
1984, J. Mol. Biol. 179:125-142. Genetically encoded hydrophilic amino acids include Thr
(T), Ser (S), His (H), Glu (E), Asn (N), Gln (Q), Asp (D), Lys (K) and Arg (R).

"Acidic amino acid" refers to a hydrophilic amino acid having a side chain pK value 
of less than 7. Acidic amino acids typically have negatively charged side chains at 
physiological pH due to loss of a hydrogen ion. Genetically encoded acidic amino acids
include Glu (E) and Asp (D). 

"Basic amino acid" refers to a hydrophilic amino acid having a side chain pK value of 
greater than 7. Basic amino acids typically have positively charged side chains at 
physiological pH due to association with hydronium ion. Genetically encoded basic amino 
acids include His (H), Arg (R) and Lys (K). 

"Polar amino acid" refers to a hydrophilic amino acid having a side chain that is 
uncharged at physiological pH, but which has at least one bond in which the pair of electrons
shared in common by two atoms is held more closely by one of the atoms. Genetically
encoded polar amino acids include Asn (N), Gln (Q), Ser (S) and Thr (T).

"Hydrophobic amino acid" refers to an amino acid exhibiting a hydrophobicity of 
greater than zero according to the normalized consensus hydrophobicity scale of Eisenberg
et al., 1984, J. Mol. Biol. 179:125-142. Genetically encoded hydrophobic amino acids include
Pro (P), Ile (I), Phe (F), Val (V), Leu (L), Trp (W), Met (M), Ala (A), Gly (G) and Tyr (Y).

"Aromatic amino acid" refers to a hydrophobic amino acid with a side chain having at 
least one aromatic or heteroaromatic ring. The aromatic or heteroaromatic ring may contain
one or more substituents such as -OH, -SH, -CN, -F, -Cl, -Br, -I, -NO2, -NO, -NH2, -NHR,
-NRR, -C(O)R, -C(O)OH, -C(O)OR, -C(O)NH2, -C(O)NHR, -C(O)NRR and the like, where
each R is independently (C1-C6) alkyl, substituted (C1-C6) alkyl, (C2-C6) alkenyl, substituted
(C2-C6) alkenyl, (C2-C6) alkynyl, substituted (C2-C6) alkynyl, (C5-C20) aryl, substituted (C5-C20)
aryl, (C6-C26) arylalkyl, substituted (C6-C26) arylalkyl, 5-20 membered heteroaryl, substituted
5-20 membered heteroaryl, 6-26 membered heteroarylalkyl or substituted 6-26 membered
heteroarylalkyl. Genetically encoded aromatic amino acids include His (H), Phe (F), Tyr (Y)
and Trp (W).

"Apolar amino acid" refers to a hydrophobic amino acid having a side chain that is
uncharged at physiological pH and which has bonds in which the pair of electrons shared in
common by two atoms is generally held equally by each of the two atoms (i.e., the side chain
is not polar). Genetically encoded apolar amino acids include Leu (L), Val (V), Ile (I), Met
(M), Gly (G) and Ala (A).

"Aliphatic amino acid" refers to a hydrophobic amino acid having an aliphatic 
hydrocarbon side chain. Genetically encoded aliphatic amino acids include Ala (A), Val (V), 
Leu (L) and Ile (I).

"Hydroxyl-substituted aliphatic amino acid" refers to a hydrophilic polar amino acid 
having a hydroxyl-substituted side chain. Genetically-encoded hydroxyl-substituted aliphatic 
amino acids include Ser (S) and Thr (T). 

The amino acid residue Cys (C) is unusual in that it can form disulfide bridges with
other Cys (C) residues or other sulfanyl-containing amino acids. The ability of Cys (C)
residues (and other amino acids with -SH containing side chains) to exist in a peptide in either
the reduced free -SH or oxidized disulfide-bridged form affects whether Cys (C) residues
contribute net hydrophobic or hydrophilic character to a peptide. While Cys (C) exhibits a
hydrophobicity of 0.29 according to the normalized consensus scale of Eisenberg (Eisenberg,
1984, supra), it is to be understood that for purposes of the present invention Cys (C) is
categorized as a polar hydrophilic amino acid, notwithstanding the general classifications
defined above.
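
The sign convention used in the definitions above, together with the stated exception for Cys, can be sketched as a small rule. The function name is hypothetical; the only scale value taken from the text is 0.29 for Cys, and the other values used below are placeholders rather than values from the Eisenberg scale.

```python
def hydropathy_class(residue, hydrophobicity):
    """Classify a residue by the sign of its normalized consensus hydrophobicity.

    residue is a one-letter symbol; hydrophobicity is the scale value
    supplied by the caller.
    """
    # Cys is categorized as hydrophilic regardless of its scale value.
    if residue == "C":
        return "hydrophilic"
    if hydrophobicity < 0:
        return "hydrophilic"
    if hydrophobicity > 0:
        return "hydrophobic"
    # A value of exactly zero is not covered by the definitions.
    return "unclassified"


print(hydropathy_class("C", 0.29))
```

The explicit check for Cys comes first so that the override cannot be bypassed by the sign test.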

As will be appreciated by those of skill in the art, the above-defined categories are not 
mutually exclusive. Thus, amino acids having side chains exhibiting two or more physical- 
chemical properties can be included in multiple categories. For example, amino acid side 
chains having aromatic moieties that are further substituted with polar substituents, such as 
Tyr (Y), may exhibit both aromatic hydrophobic properties and polar or hydrophilic 
properties, and can therefore be included in both the aromatic and polar categories. As 
another example, His (H) has a side chain that falls within the aromatic and basic categories. 
The appropriate categorization of any amino acid will be apparent to those of skill in the art, 
especially in light of the detailed disclosure provided herein.

While the above-defined categories have been exemplified in terms of the genetically
encoded amino acids, the amino acid substitutions need not be, and in certain embodiments 
preferably are not, restricted to the genetically encoded amino acids. Indeed, since many of 
the compounds described herein may be produced synthetically, they may comprise one or 
more genetically non-encoded amino acids. Thus, in addition to the naturally occurring 
genetically encoded amino acids, amino acid residues in the core peptides of structure (I) may 
be substituted with naturally occurring non-encoded amino acids and synthetic amino acids. 
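
Because the categories defined above are not mutually exclusive, they are conveniently represented as overlapping sets. The sketch below encodes only the genetically encoded residues listed in the definitions (with Cys placed in the polar hydrophilic group, as stated above). Treating a substitution as conservative when both residues share a subcategory is an assumption made for illustration; the names `CATEGORIES`, `categories_of` and `is_conservative` are hypothetical. Non-encoded residues could be added to the same sets in the same way.

```python
# Category membership for the genetically encoded residues (one-letter symbols).
CATEGORIES = {
    "hydrophilic": set("TSHENQDKRC"),  # includes Cys per the stated exception
    "acidic": set("ED"),
    "basic": set("HRK"),
    "polar": set("NQSTC"),
    "hydrophobic": set("PIFVLWMAGY"),
    "aromatic": set("HFYW"),
    "apolar": set("LVIMGA"),
    "aliphatic": set("AVLI"),
}

SUBCATEGORIES = ("acidic", "basic", "polar", "aromatic", "apolar")


def categories_of(residue):
    """Return every category that contains the residue, sorted by name."""
    return sorted(name for name, members in CATEGORIES.items() if residue in members)


def is_conservative(original, replacement):
    """Assumed rule: conservative if both residues share at least one subcategory."""
    return any(
        original in CATEGORIES[name] and replacement in CATEGORIES[name]
        for name in SUBCATEGORIES
    )


print(categories_of("H"))
print(is_conservative("E", "D"), is_conservative("E", "K"))
```

His illustrates the overlap: it appears in the aromatic and basic sets as well as in the hydrophilic set.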

Certain commonly encountered amino acids of which the compounds of the invention
may be comprised include, but are not limited to, β-alanine (β-Ala) and other omega-amino
acids such as 3-aminopropionic acid, 2,3-diaminopropionic acid (Dpr), 4-aminobutyric acid
and so forth; α-aminoisobutyric acid (Aib); ε-aminohexanoic acid (Aha); δ-aminovaleric
acid (Ava); N-methylglycine or sarcosine (MeGly); ornithine (Orn); citrulline (Cit);
t-butylalanine (t-BuA); t-butylglycine (t-BuG); N-methylisoleucine (MeIle); phenylglycine
(Phg); cyclohexylalanine (Cha); norleucine (Nle); naphthylalanine (Nal);
4-chlorophenylalanine (Phe(4-Cl)); 2-fluorophenylalanine (Phe(2-F)); 3-fluorophenylalanine
(Phe(3-F)); 4-fluorophenylalanine (Phe(4-F)); penicillamine (Pen);
1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid (Tic); β-2-thienylalanine (Thi); methionine
sulfoxide (MSO); homoarginine (hArg); N-acetyl lysine (AcLys); 2,4-diaminobutyric acid
(Dbu); 2,3-diaminobutyric acid (Dab); p-aminophenylalanine (Phe(pNH2)); N-methyl valine
(MeVal); homocysteine (hCys), homophenylalanine (hPhe) and homoserine (hSer);
hydroxyproline (Hyp), homoproline (hPro), N-methylated amino acids and peptoids
(N-substituted glycines).

The classifications of the genetically encoded and common non-encoded amino acids 
according to the categories defined above are summarized in Table 1, below. It is to be 
understood that Table 1 is for illustrative purposes only and does not purport to be an 
exhaustive list of amino acid residues that can be used in the invention. Additional amino 
acids may be found in Fasman, 1989, Practical Handbook of Biochemistry and Molecular
Biology, CRC Press, Inc., pp. 3-70, and the references cited therein.

TABLE 1: CLASSIFICATIONS OF COMMONLY ENCOUNTERED AMINO ACIDS

Classification    Genetically Encoded    Non-Genetically Encoded

Hydrophobic
  Aromatic        H, F, Y, W             Phg, Nal, Thi, Tic, Phe(4-Cl), Phe(2-F),
                                         Phe(3-F), Phe(4-F), hPhe
  Apolar          L, V, I, M, G, A, P    t-BuA, t-BuG, MeIle, Nle, MeVal, Cha,
                                         MeGly, Aib
  Aliphatic       A, V, L, I             b-Ala, Dpr, Aib, Aha, MeGly, t-BuA,
                                         t-BuG, MeIle, Cha, Nle, MeVal
Hydrophilic
  Acidic          D, E
  Basic           H, K, R                Dpr, Orn, hArg, Phe(p-NH2), Dbu, Dab
  Polar           C, Q, N, S, T          Cit, AcLys, MSO, b-Ala, hSer

As utilized herein, the term "pilus" or "pili" relates to fibrillar heteropolymeric
structures embedded in the cell envelope of many tissue-adhering pathogenic bacteria,
notably pathogenic Gram-negative bacteria. In the present specification, the terms pilus and
pili will be used interchangeably. A pilus is composed of a number of "pilus subunits" which
constitute distinct functional parts of the intact pilus.

The term "chaperone" relates to a molecule which in living cells has the responsibility 
of binding to polypeptides in order to mature the polypeptides in a number of ways. Many 
molecular chaperones are involved in the process of folding polypeptides into their native
conformations whereas other molecular chaperones are involved in the export out of or 
import into the cell of polypeptides. Specialized molecular chaperones are "periplasmic 
chaperones" which are bacterial molecular chaperones exerting their main actions in the 
"periplasmic space." Specialized periplasmic chaperones also have an immunoglobulin-like 
three dimensional structure. The periplasmic space constitutes the space in between the inner 
and outer bacterial membrane. Periplasmic chaperones are involved in the process of correct 
assembly of intact pili structures. When used herein, the use of the term "chaperone" 
designates a molecular, periplasmic chaperone unless otherwise indicated.

The phrase "preventing or inhibiting binding between pilus subunits and a periplasmic
chaperone" indicates that the normal interaction between a chaperone and its natural ligand,
i.e., the pilus subunit, is being affected either by being inhibited, expressed in another
manner, or reduced to such an extent that the binding of the pilus subunit to the chaperone is 
measurably lower than is the case when the chaperone is interacting with the pilus subunit at 
conditions which are substantially identical (with regard to pH, concentration of ions, and 
other molecules) to the native conditions in the periplasmic space. Measurement of the 
degree of binding can be determined in vitro by methods known to the person skilled in the 
art (microcalorimetry, radioimmunoassays, enzyme based immunoassays, etc.). 

The phrase "preventing or inhibiting binding between pilus subunits" generally 
indicates that the normal interaction between pilus subunits is being affected either by being 
inhibited, expressed in another manner, or reduced to such an extent that the binding of a 
pilus subunit to another pilus subunit is measurably lower than is the case when the pilus 
subunits are interacting at conditions which are substantially identical (with regard to pH,
concentration of ions, and other molecules) to the native conditions during pilus assembly.
This phrase can apply to the dissociation [...] interactions
during pilus assembly. Measurement of the degree of binding can be determined in vitro by
methods known to the person skilled in the art (microcalorimetry, radioimmunoassays, 
enzyme based immunoassays, etc.). 

The compounds and compositions of the present invention which prevent or inhibit 
binding between pilus subunits or between a pilus chaperone and subunit are said to exhibit
"antibacterial activity." 

By the term "subject in need thereof" is in the present context meant a subject, which
can be any plant or animal, including a human being, who is infected with, or is likely to be 
infected with, tissue-adhering pilus-forming bacteria which are believed to be pathogenic. 

By the term "an effective amount" is meant an amount of the substance in question 
which will in a majority of patients have either the effect that the disease caused by the
pathogenic bacteria is cured or ameliorated or, if the substance has been given 
prophylactically, the effect that the disease is prevented from manifesting itself. The term "an 
effective amount" also implies that the substance is given in an amount which only causes 
mild or no adverse effects in the subject to whom it has been administered, or that the adverse
effects may be tolerated from a medical and pharmaceutical point of view in the light of the
severity of the disease for which the substance has been given. 

As used herein, "treatment" includes both prophylaxis and therapy. Thus, in treating a 
subject, the compounds of the invention may be administered to a subject already harboring a 
bacterial infection or in order to prevent such infection from occurring. 

By the term "a mimic of a pilus subunit" is meant a compound which has been
established to bind to a chaperone or to another pilus subunit in a manner which is
comparable to the way the pilus subunit binds to the chaperone or to the way that the pilus
subunits bind to each other, respectively.

The terms "an analogue of a G1 beta-strand of a periplasmic chaperone" or "a mimic
of a G1 beta-strand of a periplasmic chaperone" denote any substance which mimics or has
the ability to bind to at least one pilus subunit in a manner which corresponds to the binding
of a chaperone to a pilus subunit in the periplasmic space. Such an analogue or mimic of the
chaperone can be a modified form of the intact chaperone (e.g., one of the two domains of
PapD) or it can be a modified form of the chaperone which may, e.g., be coupled to a probe,
marker or another moiety. Another such analogue or mimic can be obtained by modifying or
mutating the G1 beta-strand of the periplasmic chaperone so that it differs from the wild-type
sequence by the substitution of at least one amino acid residue of the wild-type sequence with
a different amino acid residue and/or by the addition and/or deletion of one or more amino
acid residues to or from the wild-type sequence. The additions and/or deletions can be from
an internal region of the wild-type sequence and/or at either or both of the N- or C-termini. In
the present context, the pilus subunit, mimic or analogue thereof exhibits at least one binding
characteristic relevant for the assembly of pili.

In the present context the terms "an analogue of a pilus subunit" and "a mimic of a 
pilus subunit" should be understood, in a broad sense, to mean any substance which mimics 
(with respect to binding characteristics) an effective part of a pilus subunit (e.g., the
amino-terminal portion of the pilus subunit). Thus, the analogue or mimic may simply be any other
compound regarded as capable of mimicking the binding between pilus subunits in vivo or in
vitro. In the present context, the pilus subunit, mimic or analogue thereof exhibits at least one
binding characteristic relevant for the assembly of pili.

In the present context the terms "a mannose analogue" or "a mannose mimic" should
be understood, in a broad sense, to mean any substance which mimics (with respect to binding
characteristics) the mannose sugar which binds to an effective part of the FimH adhesin (e.g.,
the NH2-terminal mannose-binding domain). Thus, the analogue or mimic may simply be any
other compound regarded as capable of mimicking the binding of a mannose-oligosaccharide
to FimH adhesin in vivo or in vitro. In the present context, the mannose analogue or mannose
mimic exhibits at least one binding characteristic relevant for the adhesion of pili.

The term "donor strand complementation" refers to the mechanism by which a
chaperone donates its G1 beta-strand to complete the fold of a pilus subunit.

The term "donor strand exchange" refers to the mechanism by which the amino-terminal
extension of a pilus subunit displaces the G1 beta-strand of a pilus chaperone and
subsequently occupies the subunit groove previously occupied by the G1 beta-strand.

The term "crystallized PapD-PapK chaperone-subunit co-complex" refers to a
polypeptide co-complex having an amino acid sequence as set out in SEQ ID NO: T and SEQ
ID NO: 12 and which is in crystalline form.

The term "crystal" refers to a composition comprising a polypeptide in crystalline
form. The term "crystal" includes native crystals, heavy-atom derivative crystals and
co-crystals, as defined herein.

The term "native crystal" refers to a crystal wherein the polypeptide is substantially
pure. As used herein, native crystals do not include crystals of polypeptides comprising 
amino acids that are modified with heavy atoms, such as crystals of selenomethionine 
mutants, selenocysteine mutants, etc. 

The term "heavy-atom derivative crystal" refers to a crystal wherein the polypeptide is 
in association with one or more heavy-metal atoms. As used herein, heavy-atom derivative 
crystals include native crystals into which a heavy metal atom is soaked, as well as crystals of 
selenomethionine mutants and selenocysteine mutants. 

The term "co-complex" refers to a polypeptide in association with one or more 
additional polypeptides or other molecules. For example, the PapD-PapK and FimC-FimH 
assemblies are co-complexes.

The term "co-crystal" refers to a composition comprising a co-complex, as defined
above, in crystalline form. Co-crystals include native co-crystals and heavy-atom derivative
co-crystals.

The term "unit cell" refers to the smallest and simplest volume element (i.e.,
parallelepiped-shaped block) of a crystal that is completely representative of the unit or pattern
of the crystal. The dimensions of the unit cell are defined by six numbers: dimensions a, b
and c and angles α, β and γ (Blundell et al., 1976, Protein Crystallography, Academic Press).
A crystal is an efficiently packed array of many unit cells.
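
As a worked illustration of how the six unit cell numbers define the parallelepiped, its volume follows from the standard formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ). The sketch below applies that formula; the cell dimensions used are hypothetical and are not those of any crystal described herein.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a unit cell from edge lengths and angles given in degrees."""
    ca, cb, cg = (math.cos(math.radians(angle)) for angle in (alpha, beta, gamma))
    # General triclinic expression; it reduces to a * b * c when all angles are 90.
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Hypothetical orthorhombic cell (all angles 90 degrees), lengths in angstroms.
print(unit_cell_volume(10.0, 20.0, 30.0, 90.0, 90.0, 90.0))
```

The result is in cubic angstroms when the edge lengths are given in angstroms.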

The phrase "having substantially the same three-dimensional structure" refers to a 
polypeptide that is characterized by a set of atomic structure coordinates that have a root 
mean square deviation (r.m.s.d.) of less than or equal to about 2 Å when superimposed onto
the atomic structure coordinates of Tables 4 or 5 when at least about 50% to 100% of the Cα
atoms of the coordinates are included in the superposition. 
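
The r.m.s.d. criterion above can be sketched as follows. The sketch assumes that the two sets of Cα coordinates have already been optimally superimposed and paired atom by atom; it does not perform the superposition itself. The function names, the hard 2.0 Å cutoff and the 50% minimum fraction are simplifications of the "about" language in the definition, and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def substantially_same(coords_a, coords_b, n_reference_ca, cutoff=2.0, min_fraction=0.5):
    """Apply the assumed cutoff to pre-superimposed C-alpha coordinates."""
    fraction_included = len(coords_a) / n_reference_ca
    return fraction_included >= min_fraction and rmsd(coords_a, coords_b) <= cutoff


# Hypothetical pre-superimposed C-alpha positions, in angstroms.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model, reference))
```

In practice the superposition step (for example a least-squares fit) would precede this calculation and would be done with a structural biology toolkit.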

Detailed Description of the Invention
In accordance with the present invention, applicants have designed and fabricated 
compounds which mimic components of chaperones such as PapD and pilus subunits such as
PapK, and which thereby function to interfere with pilus assembly. Specifically, applicants 
have devised compounds and methods which interfere with the binding of a chaperone or a 
pilus subunit to a pilus subunit which will thus interfere with the formation of intact pili, 
thereby reducing the capacity of bacteria to adhere to host epithelium. Further, applicants 
have devised compounds which interfere with the adhesion of FimH adhesin to mannose 
oligosaccharides located on the host epithelium thereby reducing the capacity of piliated 
bacteria to attach to and infect host tissues. Applicants have further demonstrated that 
prevention or inhibition of pilus assembly in Gram-negative pathogens can be accomplished 
in a number of ways. 

The co-crystal structure of PapD has been resolved and refined to a 2.0 angstrom 
resolution, revealing a molecule with two immunoglobulin-like domains oriented in an L
shape to form a cleft at their interface. See A. Holmgren and C.E. Brenden, Nature, 342:248 
(1989). The chaperone cleft contains surface-exposed residues that are highly conserved. 
Each immunoglobulin-like domain has a beta-barrel structure formed by two antiparallel
beta-pleated sheets with an overall topology similar to an immunoglobulin fold. Applicants
have resolved the co-crystal structure of the PapD-PapK chaperone-subunit co-complex
which reveals how PapD stabilizes pilus subunits in the periplasm. Further, a combination of
genetic, biochemical, and crystallographic data has demonstrated that the G1 beta-strand of
PapD forms a beta-zipper interaction with the highly conserved COOH-terminal motif of
pilus subunits. See Hung et al., EMBO J. 15:3792 (1996); Kuehn et al., Science 262:1234
(1993); Soto et al., EMBO J. 17:6155 (1998). This COOH-terminal motif also comprises at
least part of a primary surface for subunit-subunit assembly interactions, indicating that the
direct capping of a primary assembly surface is part of the molecular basis by which
periplasmic chaperones prevent the premature oligomerization of pilus subunits. In addition,
it is believed that the beta-zipper interaction facilitates the folding of the subunit into a
native-like conformation via a template-mediated mechanism.

Applicants have solved the three dimensional co-crystal structure of a FimC-FimH
chaperone-adhesin co-complex from uropathogenic E. coli. See Choudhury et al., Science
285:1061 (1999). This structure supports the molecular mechanism described above. Specifically,
applicants have demonstrated that in the FimC-FimH co-complex, the seventh (G1) strand
from the NH2-terminal domain of the chaperone is used to complement the pilin domain
between the second half of the A strand and the F strand of the domain. As such, the F strand
of FimH forms a parallel beta-strand interaction with the G1 beta-strand of FimC and has its
COOH-terminal carboxyl group anchored in the crevice of the chaperone cleft of FimC.

Thus, applicants have elucidated the mechanism of binding between PapD and the 
pilus subunit PapK, thereby identifying an essential part of a defined binding site responsible 
for the binding between pilus subunits as well as binding between pilus subunits and their
periplasmic chaperones. Furthermore, applicants have utilized the PapD-PapK co-crystal 
structure, the first of such a co-complex, and the FimC-FimH co-crystal structure to provide 
further insights into the processes of subunit folding, capping, and assembly in the 
chaperone/usher pathway of pilus biogenesis, and thereby devised compounds, compositions 
and methods for the prevention and inhibition of pilus formation. 

Furthermore, applicants have elucidated the mannose binding domain of the FimH
adhesin which is responsible for mediating the binding of pili to mannose receptors on host
cells. As demonstrated further in the examples, a pocket capable of accommodating a
mono-mannose unit is located at the tip of the lectin domain of the FimH adhesin. Applicants have
utilized the identification of this mannose-binding site to design compounds and
compositions which would function to interfere with pilus attachment to epithelial tissues,
thereby inhibiting or preventing the ability of the bacterium to infect host tissues.



PapD-PapK Chaperone-Subunit Co-Complex

An important aspect of the PapD-PapK chaperone-subunit co-complex is the structure 
of the PapK subunit. PapK has an immunoglobulin-like fold; however, it lacks the canonical 
seventh beta-strand and in its place is a deep groove located on the surface of the PapK 
subunit. The base of the groove on the surface of the PapK subunit is formed by the 
hydrophobic core of the protein. From the resolved co-crystal structure of the PapD-PapK 
chaperone-subunit co-complex, it can be seen that the G1 beta-strand of the chaperone
occupies this groove and prevents the exposure of the hydrophobic core of the subunit, which
would lead to the destabilization and degradation of the subunits.

Moreover, the PapD-PapK chaperone-subunit co-complex provides further insight
into the mechanism by which pilus subunits assemble to form a mature, intact pilus. The
eight amino acids located on the amino-terminus of PapK are disordered and presumably
project away from the co-complex. These residues contain a pattern of alternating
hydrophobic residues typical of a beta-strand which is conserved in pilus subunits. Thus,
while not being bound to a particular theory, it is believed that in the mature pilus, the
amino-terminal residues of one subunit occupy the groove of the adjacent subunit.
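
A pattern of alternating hydrophobic residues of the kind described above can be detected with a simple scan. In this sketch the hydrophobic set is the list of genetically encoded hydrophobic residues given in the definitions; the scoring rule (counting hydrophobic residues at every second position) and the example sequence are hypothetical and are not the PapK amino-terminal sequence.

```python
# Genetically encoded hydrophobic residues as listed in the definitions.
HYDROPHOBIC = set("PIFVLWMAGY")


def max_alternating_hydrophobic(seq):
    """Longest run of hydrophobic residues found at every second position."""
    best = 0
    for start in range(len(seq)):
        count = 0
        position = start
        # Step two residues at a time while the residue is hydrophobic.
        while position < len(seq) and seq[position] in HYDROPHOBIC:
            count += 1
            position += 2
        best = max(best, count)
    return best


# Hypothetical eight-residue segment with hydrophobic residues at alternate positions.
print(max_alternating_hydrophobic("SVKLTIDN"))
```

The scan does not require the intervening positions to be hydrophilic; a stricter test could add that condition.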

In the PapD-PapK co-complex structure, strand F of PapK forms one side of the 
groove into which the G1 beta-strand of the chaperone is inserted and is likely to assume the
same structural role in pilins. Structural, biochemical and genetic data have demonstrated
that strand F (and hence the groove) in pilins is involved in both chaperone-subunit and
subunit-subunit interactions. By donating a secondary structural element to the fold of the
pilin, the chaperone not only contributes to the stability of the pilin but also prevents other
pilins in the periplasm from binding to the groove of the chaperone-bound subunit.

The amino-terminal region of pilins, corresponding to the disordered amino-terminus 
of PapK, has also been shown to form an assembly surface on the pilin. The eight
NH2-terminal residues are disordered in the PapD-PapK co-complex and protrude away from the
main body of the co-crystal structure where they would be free to interact with the groove of
the preceding subunit located at the usher. The amino-terminus of an incoming subunit
inserts into the groove of the preceding subunit, displacing the G1 beta-strand of the
chaperone in a mechanism that is facilitated by the usher. Applicants refer to this mechanism
as "donor strand exchange". Donor strand exchange implies that in the pilus, the
NH2-terminal strand of one subunit would complete the immunoglobulin-like fold and protect the
hydrophobic core of the preceding subunit, much as the chaperone does in the periplasm.

A donor strand exchange model for pilus assembly employing a PapK structure was
utilized to model a PapA pilus rod. Pilus rods are well-ordered helical structures with a
diameter of 68 Å, a pitch of 24.9 Å, and 3.28 subunits per turn. The disordered NH2-terminus
of PapK was modeled as a beta-strand protruding from the Ig fold at an angle consistent with
the ordered portion of the NH2-terminus in the structure, and inserted into the groove of the
preceding subunit. A pilus rod with the appropriate general features and without steric
clashes could be built by applying identical translational and rotational operations to
successive subunits. The model pilus has a 72 Å diameter, a pitch of approximately 22 Å,
and approximately 3.3 subunits per turn, similar to the actual dimensions of the pilus rod
(Fig. 7). However, the model has an unexpected feature: the NH2-terminal strand of one
subunit runs antiparallel (not parallel as does the G1 beta-strand of PapD) to strand F of the
preceding subunit. A parallel beta-strand interaction with strand F of the preceding subunit
would produce a rod with a star-shaped cross-section (Figs. 7A and 7B), inconsistent with the
electron microscopy data. Thus, while donor strand complementation with the chaperone
results in an atypical immunoglobulin fold, donor strand exchange between subunits produces
a canonical variable-region immunoglobulin fold in the mature pilus.
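
The helical parameters quoted above determine the identical translation and rotation applied to each successive subunit: the axial rise per subunit is the pitch divided by the number of subunits per turn, and the rotation per subunit is 360 degrees divided by the same number. The sketch below works this out for the reported rod (24.9 Å pitch, 3.28 subunits per turn, giving roughly 7.6 Å and 110 degrees per subunit). Placing each subunit as a single point at half the rod diameter is a simplifying assumption made only for illustration, and the function names are hypothetical.

```python
import math


def helical_step(pitch, subunits_per_turn):
    """Return (axial rise in angstroms, rotation in degrees) per subunit."""
    rise = pitch / subunits_per_turn
    twist = 360.0 / subunits_per_turn
    return rise, twist


def build_rod(n_subunits, diameter, pitch, subunits_per_turn):
    """Place n points on a helix by repeating the same rotation and translation."""
    radius = diameter / 2.0  # assumption: subunit reference point at the rod surface
    rise, twist = helical_step(pitch, subunits_per_turn)
    positions = []
    for index in range(n_subunits):
        angle = math.radians(index * twist)
        positions.append((radius * math.cos(angle), radius * math.sin(angle), index * rise))
    return positions


rise, twist = helical_step(24.9, 3.28)
rod = build_rod(5, 68.0, 24.9, 3.28)
print(f"rise={rise:.2f} A, twist={twist:.1f} deg, subunits placed={len(rod)}")
```

Running `helical_step(22.0, 3.3)` gives the corresponding values for the model pilus, which allows the two sets of parameters to be compared directly.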

FimC-FimH Chaperone-Adhesin Co-Complex

Further evidence illustrating donor strand complementation is provided by the 
resolution of the co-crystal structure of the FimC-FimH chaperone-adhesin co-complex from 
uropathogenic E. coli. See Choudhury et al., Science 285:1061 (1999). The FimC-FimH
chaperone-adhesin co-complex structure also reveals a donor strand complementation
mechanism that explains the basis of both chaperone function and pilus biogenesis.

The FimH adhesin subunit is folded into two domains of the all-beta class, an
NH2-terminal mannose-binding domain and a COOH-terminal pilin domain. A short extended
linker (residues 157H - 159H) connects the two domains. The NH2-terminal mannose-binding
domain comprises residues 1H - 156H, and the COOH-terminal pilin domain which
is used to anchor the adhesin to the pilus comprises residues 160H - 279H (Figure 8A). The 
pilin domain of FimH binds in the cleft of the chaperone (Figure 9B) with limited contact 
between FimH and the COOH-terminal domain of FimC. 
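
The residue ranges given above can be captured in a small lookup, which is convenient when annotating FimH residues mentioned elsewhere in the description. The boundaries are those stated in the text; the names `FIMH_REGIONS` and `fimh_region` are hypothetical.

```python
# (first residue, last residue, region name), using FimH residue numbering.
FIMH_REGIONS = [
    (1, 156, "mannose-binding domain"),
    (157, 159, "linker"),
    (160, 279, "pilin domain"),
]


def fimh_region(residue_number):
    """Return the FimH region that contains the given residue number."""
    for first, last, name in FIMH_REGIONS:
        if first <= residue_number <= last:
            return name
    raise ValueError(f"residue {residue_number} is outside the range 1-279")


print(fimh_region(158))
```

For example, a residue numbered 181H falls in the pilin domain under this mapping.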

The lectin domain of FimH is an eleven-stranded elongated beta-barrel with a jelly 
roll-like topology (Figure 8B). The fold starts with a short beta hairpin that is not part of the
jelly roll. The final (eleventh) strand of the domain is inserted between the third and tenth 
strands and thus breaks the jelly-roll topology. A pocket capable of accommodating a mono- 
mannose unit is located at the tip of the domain, distal from the connection to the pilin 
domain (Figure 9B). The bottom of the pocket is lined with asparagine, glutamine and 
aspartic acid residues in three loop regions which are typical carbohydrate binding side chains 
(Figure 10A). These residues form hydrogen bonds with C-HEGA as described in Example 3 
herein.

The pilin domain of FimH has the same immunoglobulin-like topology as the
amino-terminal domain of FimC, except that the seventh strand of the fold is missing (Figure 8B).
Two anti-parallel beta-sheets (strands A'BED' and D"CF) pack against each other to form a 
beta-barrel that is similar to, but distinct from, immunoglobulin barrels. As in the 
chaperones, strand switching occurs at the edges of the sheets. In the chaperones, the A1
strand of the amino-terminal domain switches between the sheets of the barrel. The first
strand of the pilin domain exhibits a similar switch, but due to the lack of a seventh strand,
the second half of the A strand is not involved in main chain hydrogen bonding within the 
domain. The D strand of the chaperones as well as of the FimH pilin domain also switches, 
but in the pilin domain the switch is an eight-residue loop instead of the cis-proline bulge 
found in the chaperones. The C-D loop and the D'-D" connection pack against each other 
and close the top of the barrel. The other side of the barrel, defined by the A and F edge 
strands, is open. Due to the absence of a seventh strand a deep scar is created on the surface 
of the domain. Residues that would be part of the hydrophobic core of an intact, seven 
stranded PapD-like domain instead line a deep hydrophobic crevice on the surface of the pilin 
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domain (Figure 10B). 

As mentioned herein, the donor strand complementation mechanism refers to the
chaperone donating its G1 beta-strand to complete the fold of the pilin domain. The G1
beta-strand of periplasmic chaperones such as FimC and PapD contains a conserved motif of
solvent-exposed hydrophobic residues at positions 103, 105 and 107. In the
chaperone-subunit co-complex, the G1 beta-strand containing these alternating hydrophobic residues is
used to complete the unfinished hydrophobic core of pilus subunits such as FimH and PapK.
Thus, in the FimC-FimH co-complex, these hydrophobic residues are used to complete the
unfinished hydrophobic core of FimH which results from the missing seventh strand.
Specifically, the seventh (G1) strand from the NH2-terminal domain of the FimC chaperone
complements the FimH pilin domain by being inserted between the second half of the A
strand and the F strand of the domain (Figure 10C). Leu103C and Leu105C are deeply buried in
the crevice in the FimH pilin domain. Leu103C of FimC contacts residues Ile181H, Val223H,
Leu223H and Ile272H of FimH. Leu105C of FimC is in contact with Ile181H, Leu252H, Ile 2 ™ and
Val274H of FimH. Ile107C is closer to the FimH pilin domain surface but makes van der Waals
contacts with residues Val'«« and Phe 2 ™. The final strand (F) of FimH forms a parallel
beta-strand interaction with the G1 beta-strand of FimC and has its COOH-terminal carboxylate
group anchored in the crevice of the chaperone cleft through hydrogen bonding with the
conserved residues Arg8C and Lys112C in FimC (Figure 9A). This interaction is critical for
chaperone function.

Furthermore, the two conserved motifs of FimH (the COOH-terminal F strand and an
amino-terminal motif) participate in subunit-subunit interactions necessary for pilus
assembly. See G.E. Soto et al., EMBO J., 17:6155 (1998). An alignment of the pilin
sequences demonstrates that the amino-terminal motif of FimC was part of a 10-20 residue
NH2-terminal extension that was missing in the FimH pilin domain (Figure 8A) and
disordered in the PapD-PapK co-complex as discussed above. This region contains a highly
conserved pattern of alternating hydrophobic residues (highlighted in Figure 8A) similar to
the donor G1 beta-strand of the chaperone. Applicants believe that the amino-terminal
extension of the FimH subunit is structurally analogous to the donor G1 beta-strand motif of
the chaperone and thus would fit into the pilin groove occupied by the donor G1 beta-strand
of the chaperone.

However, the type 1 pilus is a right-handed helix with about 3 subunits per turn, a
diameter of approximately 70 Å, a central pore of about 20-25 Å, and a rise per subunit of
about 8 Å. Thus, in order to obtain this structure, the insertion of the NH2-terminal extension
must be antiparallel to strand F, in contrast to the parallel insertion observed for the G1
beta-strand of the chaperone. Insertion in a parallel orientation would lead to rosette-like
structures. One edge of the pilin groove is lined by the COOH-terminal F strand and forms a
critical part of the subunit tail. Thus, without being bound to any theory, Applicants believe
that the amino-terminal extension represents the head of a subunit and that, during pilus
biogenesis, the amino-terminal extension would displace the donor G1 beta-strand of the
chaperone to fit into the tail groove of a neighboring subunit to complete the pilin fold of its
neighbor in a donor strand complementation mechanism.

Applicants constructed a model for the type 1 pilus using the FimH pilin domain as a
model for FimA (Figure 11). Each subunit was aligned to have its cleft facing towards the
center of the pilus so that the height from the top to the bottom of the domain along the helix
axis was approximately 25 Å. Applying a rotation of 115 degrees and a rise per subunit of 8
Å, a hollow helical cylinder is created. The outer diameter of this cylinder as measured across
Cα atoms is 70 Å, and the inner diameter is 25 Å. FimA subunits from different strains of E.
coli exhibit considerable allelic variation. The vast majority of the variable positions are on
the outside surface of the pilus model described above (Figure 11), which would account for
the antigenic variability of type 1 pili.
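
The helical parameters quoted for the pilus model can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the disclosed model. It takes the rotation per subunit as 115 degrees and the rise as 8 Å, and it places subunit centres at an assumed radius midway between the inner (25 Å) and outer (70 Å) cylinder walls. The function names `helix_parameters` and `subunit_positions` are hypothetical.

```python
import math


def helix_parameters(rotation_deg, rise):
    """Return (subunits per turn, pitch) for a one-start helix."""
    subunits_per_turn = 360.0 / rotation_deg
    pitch = rise * subunits_per_turn
    return subunits_per_turn, pitch


def subunit_positions(n, rotation_deg, rise, radius):
    """Return (x, y, z) centres of n subunits on a right-handed helix about z."""
    coords = []
    for i in range(n):
        theta = math.radians(i * rotation_deg)
        coords.append((radius * math.cos(theta), radius * math.sin(theta), i * rise))
    return coords


# Values taken from the text: 115 degree rotation, 8 Angstrom rise per subunit.
per_turn, pitch = helix_parameters(115.0, 8.0)

# Assumption for illustration: centres sit midway between the inner wall
# (25 Angstrom diameter) and the outer wall (70 Angstrom diameter).
radius = (25.0 / 2 + 70.0 / 2) / 2
positions = subunit_positions(6, 115.0, 8.0, radius)

print(f"subunits per turn: {per_turn:.2f}, pitch: {pitch:.2f} A")
for x, y, z in positions:
    print(f"{x:8.2f} {y:8.2f} {z:8.2f}")
```

With these inputs the sketch gives about 3.13 subunits per turn and a pitch of about 25 Å, which agrees with the stated subunit height of approximately 25 Å along the helix axis.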

The head-to-tail interaction between subunits in a pilus is reminiscent of
oligomerization through three dimensional domain swapping in the sense that a part of the
molecule is used to complement another. However, in this case, complementation occurs not
only between identical protein chains (FimA in the pilus rod) but also between homologous
but distinct chains, e.g., FimG, FimF and FimH in the pilus tip. Furthermore, because
individual pilin protomers do not exist as stable monomers, there is no exchange of
structural units between a monomeric and an oligomeric state. Instead, a different protein, the
periplasmic chaperone, is needed to keep the monomeric subunits in solution by donating a
unique part of its structure (the G1 beta-strand) to the different subunit grooves.

Based on the structure of the FimC-FimH co-complex and without being limited to 
any theory, it is believed that pilins are missing necessary steric information needed to fold
into a native three dimensional structure. The information that is missing consists of the
seventh edge strand of an immunoglobulin fold. This strand, which is necessary for folding, 
is donated to the hydrophobic core of the pilin by the periplasmic chaperone in a donor strand 
complementation mechanism. 

Applicants further utilized the co-crystal structure of the FimC-FimH
chaperone-adhesin co-complex to identify the amino-terminal mannose-binding domain of FimH, an
essential component required for pilus adhesion to host tissues. As discussed above, the
bottom of this mannose-binding domain is lined with asparagine, glutamine and aspartic acid
residues, and those skilled in the art would be able to use molecular modeling techniques and
other existing protocols to design and synthesize antibacterial compounds. Such compounds
would compete with mannose for binding to the FimH adhesin, thereby preventing or
inhibiting pilus adhesion to host epithelium.

Thus, applicants utilized the discovery of this molecular mechanism of protein
binding to identify an essential part of a defined binding site responsible for pilus assembly
and adhesion. Further, applicants have utilized this structure to design and fabricate methods
and compounds to compete with the chaperone for binding to the exposed binding site of the
pilus subunit, thereby inhibiting pilus assembly and reducing the pathogenicity of piliated
Gram-negative bacteria. Such a compound is useful in treating bacterial diseases or in
preventing costly biofilm formation in medical, industrial and various other settings.

Peptide compounds 

Thus, the present invention is directed to compounds which mimic the capability of a 
periplasmic chaperone or of a pilus subunit to bind to the groove of a pilus subunit, thereby 
preventing or inhibiting pilus biogenesis by interfering with the normal function of these 
biological components. Specifically, applicants have shown that prevention or inhibition of 
the binding between pilus subunits and between pilus subunits and periplasmic chaperones 
can be accomplished in a number of ways. 

In a preferred embodiment of the invention, the compounds are peptides or peptide
analogs that are capable of disrupting the assembly of pilus subunits and/or binding the cleft
of a pilus subunit that is bound by the G1 beta-strand of another pilus subunit in an assembled
pili structure and comprise a core sequence of residues preferably derived from a conserved
N-terminal region of a pilus subunit. As will be apparent from alignments of the conserved
N-terminal regions of the various pilus subunits, such peptides and peptide analogs will
typically comprise at least two alternating hydrophobic amino acids. The core sequence of
such peptides and peptide analogs may be derived from the amino-terminal sequence of any
of a number of pilus subunits, including but not limited to, PapA, PrsA, FimA, SfaA, FocA,
HifA, HafA, Fim2, Fim3, MrpA, PmfA, LpfA, PefA, ArfA, PapK, PrsK, PapH, PrsH, PapE,
PrsE, MrpB, SfaG, SfaS, FocG, FocF, PapF, PrsF, MrpF, MrpE, F17A, FanC, FaeA, MrkA
and RalC. Typically, the core sequence is composed of about 3 to about 12 residues,
preferably 5 to 9, most preferably 7 residues. The core sequence may correspond identically
to the sequence of a pilus subunit, or it may include one or more substitutions, preferably
conservative substitutions, and/or insertions and/or deletions.
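
The requirement for alternating hydrophobic amino acids can be illustrated with a short Python sketch. It is a hedged example rather than a method taken from this disclosure: the hydrophobic residue set below (A, V, L, I, M, F, W, Y) is an assumption made for illustration, and the function name `alternating_hydrophobic_run` is hypothetical. The sketch reports the longest run of hydrophobic residues found at every second position of a sequence.

```python
# Assumed hydrophobic set (one-letter codes); the classification actually
# intended by the specification may differ.
HYDROPHOBIC = frozenset("AVLIMFWY")


def alternating_hydrophobic_run(seq):
    """Return the longest run of hydrophobic residues at positions i, i+2, i+4, ..."""
    best = 0
    for start in range(len(seq)):
        count = 0
        pos = start
        while pos < len(seq) and seq[pos] in HYDROPHOBIC:
            count += 1
            pos += 2
        best = max(best, count)
    return best


# Sequences listed in this document for FimA (N-terminal motif) and FimC (G1 strand).
for peptide in ("GTVHFKGEVV", "NTLQLAI"):
    print(peptide, alternating_hydrophobic_run(peptide))
```

Under the assumed residue set, a core sequence would satisfy the "at least two alternating hydrophobic amino acids" criterion when the function returns 2 or more.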

Moreover, the core sequence may be flanked at either or both of its N- and
C-termini by residues of random sequence (i.e., sequences that do not necessarily correspond to
the pilus subunit from which the core sequence is derived). When included, such flanking
residues should not significantly alter the ability of the core sequence to disrupt subunit
assembly. Thus, typically the compounds of the invention will include fewer than 5 flanking
residues at each terminus, preferably fewer than 3 flanking residues, and most preferably no
flanking residues.

Further, the peptides and/or peptide analogs may comprise hybrid sequences. For
example, the peptide or peptide analog may include a core sequence derived from PapA
flanked at one or both termini with sequences derived from FimA. Alternatively, the peptide
or peptide analog may include a core sequence of, for example, 7 residues, some of which
are, for example, derived from PapA and the rest of which are, for example, derived from
FimA.

In one illustrative embodiment, the compounds are 10 to 20 residue peptides and/or
peptide analogs comprising formula (I):

(I) X1-X2-X3-X4-X5-X6-X7-X8-X9-X10

or a pharmaceutically-acceptable salt thereof, wherein:

X1 is any amino acid residue, preferably other than a basic residue;

X2 is any amino acid residue, preferably other than an aliphatic residue;

X3 is a hydrophobic residue, preferably an aliphatic residue or a hydroxyl-substituted
aliphatic residue;

X4 is any amino acid residue, preferably other than an acidic residue;

X5 is a hydrophobic residue or Gly;

X6 is a hydrophobic or a hydrophilic residue;

X7 is a hydrophobic residue, preferably Gly, an amide-substituted polar residue
or an aliphatic residue, and most preferably Gly;

X8 is any amino acid residue, preferably other than an aliphatic residue;

X9 is an aliphatic residue; and

X10 is any amino acid residue, preferably a hydrophobic residue, more
preferably an aliphatic residue or a polar residue.

In the compounds comprising formula (I), the symbol "-" between residues Xn
generally designates a backbone constitutive linking function. Thus, when the compounds
are peptides, the symbol represents a peptide or amide linkage (-C(O)NH-). It is to be
understood, however, that formula (I) includes peptide analogs in which one or more amide
linkages are optionally replaced with a linkage other than an amide linkage, preferably a
substituted amide or an isostere of an amide linkage. Thus, while the various Xn residues within
formula (I) may conveniently be described in terms of "amino acids" or "residues," those
having skill in the art will recognize that in embodiments having non-amide linkages, the
term "amino acid" or "residue" as used herein refers to other bifunctional moieties bearing
side-chain groups similar in structure to the side chains of the amino acids.

Substituted amide linkages generally include, but are not limited to, groups of the
formula -C(O)N(R)-, where R is (C1-C6) alkyl, substituted (C1-C6) alkyl, (C2-C6) alkenyl,
substituted (C2-C6) alkenyl, (C2-C4) alkynyl, substituted (C2-C4) alkynyl, (C5-C20) aryl,
substituted (C5-C20) aryl, (C6-C26) arylalkyl, substituted (C6-C26) arylalkyl, 5-20 membered
heteroaryl, substituted 5-20 membered heteroaryl, 6-26 membered heteroarylalkyl and
substituted 6-26 membered heteroarylalkyl.

Isosteres of amide linkages generally include, but are not limited to, -CH2NH-,
-CH2S-, -CH2CH2-, -CH=CH- (cis and trans), -C(O)CH2-, -CH(OH)CH2- and -CH2SO-.
Compounds having such non-amide linkages and methods for preparing such compounds are
well-known in the art (see, e.g., Spatola, March 1983, Vega Data Vol. 1, Issue 3; Spatola,
1983, "Peptide Backbone Modifications" In: Chemistry and Biochemistry of Amino Acids
Peptides and Proteins, Weinstein, ed., Marcel Dekker, New York, p. 267 (general review);
Morley, 1980, Trends Pharm. Sci. 1:463-468; Hudson et al., 1979, Int. J. Prot. Res.
14:177-185 (-CH2NH-, -CH2CH2-); Spatola et al., 1986, Life Sci. 38:1243-1249 (-CH2-S-); Hann,
1982, J. Chem. Soc. Perkin Trans. I. 1:307-314 (-CH=CH-, cis and trans); Almquist et al.,
1980, J. Med. Chem. 23:1392-1398 (-COCH2-); Jennings-White et al., Tetrahedron Lett.
23:2533 (-COCH2-); European Patent Application EP 45665 (1982) CA 97:39405
(-CH(OH)CH2-); Holladay et al., 1983, Tetrahedron Lett. 24:4401-4404 (-C(OH)CH2-); and
Hruby, 1982, Life Sci. 31:189-199 (-CH2-S-)).

Additionally, one or more amide linkages can be replaced with peptidomimetic or 
amide mimetic moieties which do not significantly interfere with the structure or activity of 
the peptides. Suitable amide mimetic moieties are described, for example, in Olson et al., 
1993, J. Med. Chem. 36:3039-3049. 

Compounds comprising formula (I) that are peptide analogs may provide significant
therapeutic advantages, as their non-peptide interlinkages may confer on the compounds
enhanced stability towards proteases and/or peptidases, thereby increasing the in vivo
stability of the compounds compared to a corresponding peptide.

The various residues X1 through X10 may be selected from amongst the genetically
encoded amino acids, as well as from genetically non-encoded amino acids. Moreover, the
residues may be in either the D- or L-configuration, as long as the compound retains activity.
Compounds including D-amino acids may have enhanced in vivo stability. Preferably, all of
residues X1 through X10 are in the L-configuration.

The peptides and peptide analogs of formula (I) may optionally include, in addition to
the sequence defined by residues X1 through X10, a 1 to 5 residue peptide or peptide analog at
either or both termini. Peptide analogs typically contain at least one modified interlinkage,
such as a substituted amide or an isostere of an amide, as described above. Such additional
peptides or peptide analogs may have an amino acid sequence derived from a pilus subunit or,
alternatively, their sequences may be completely random. Compounds including such
random sequences may be tested for biological activity in the various assays and methods
described in a later section.

The residues which comprise such additional peptides or peptide analogs may be
genetically encoded or non-encoded, and may be in either the D- or L-configuration. In one
embodiment, when the sequence defined by formula (I) is a peptide, one or both termini are
"capped" with 1 to 5 residue peptides composed wholly of D-amino acids that serve to protect
the core sequence from degradation in vivo by proteases and/or peptidases.

Also included within the scope of the present invention are "blocked" forms of the
peptides and peptide analogs including formula (I), i.e., 10 to 20 residue peptides and/or peptide
analogs in which the N- and/or C-terminus is blocked with a moiety capable of reacting with
the N-terminal -NH2 or C-terminal -C(O)OH. Such blocked compounds are typically
N-terminal acylated and/or C-terminal amidated or esterified. Typical N-terminal blocking
groups include R1C(O)-, where R1 is hydrogen, (C1-C6) alkyl, (C2-C6) alkenyl, (C2-C4)
alkynyl, (C5-C20) aryl, (C6-C26) arylalkyl, 5-20 membered heteroaryl or 6-26 membered
heteroarylalkyl. Preferred N-terminal blocking groups include acetyl, formyl and dansyl.
Typical C-terminal blocking groups include -C(O)NR1R1 and -C(O)OR1, where each R1 is
independently as defined above. Preferred C-terminal blocking groups include those in
which each R1 is independently (C1-C6) alkyl, preferably methyl, ethyl, propyl or isopropyl.

Preferred amongst the 10 to 20 residue peptides and/or peptide analogs comprising
formula (I) are those compounds having one or more of the following characteristics:

X3 is an aliphatic residue or T;
X5 is an aliphatic residue, F or G; and/or
X7 is G, H or A.



Particularly preferred are the 10-residue peptides described in Table 2, below. 
Table 2: SUBUNIT N-TERMINAL-MOTIF-DERIVED PEPTIDES

AMINO ACID SEQUENCE              PILUS SUBUNIT

GKYTFNGTVV (SEQ ID NO: 2)        PapA, PrsA
GTVHFKGEVV (SEQ ID NO: 3)        FimA, SfaA, FocA
GKVTFFGKVV (SEQ ID NO: 4)        HifA, HafA
GTIVITGTIT (SEQ ID NO: 5)        Fim2
GTIVITGSIS (SEQ ID NO: 6)        Fim3
GTVKFVGSII (SEQ ID NO: 7)        MrpA
GEIQLKGEIV (SEQ ID NO: 8)        PmfA
GTIKFTGEIV (SEQ ID NO: 9)        LpfA
NEVTFLGSVS (SEQ ID NO: 10)       PefA
GTINFEGSVV (SEQ ID NO: 11)       ArfA
SDVAFRGNLL (SEQ ID NO: 12)       PapK, PrsK
GRAAFHGEVV (SEQ ID NO: 13)       PapH
GRATFHGEVV (SEQ ID NO: 14)       PrsH
DNLTFRGKLI (SEQ ID NO: 15)       PapE
DNLTFKGKLI (SEQ ID NO: 16)       PrsE
GWLNLQGTIL (SEQ ID NO: 17)       MrpB
Sy_VNITGNVQ (SEQ ID NO: 18)      SfaG
TTITVTGNVL (SEQ ID NO: 19)       SfaS
TTITVTGRVL (SEQ ID NO: 20)       FocG
UMLAUSNFVT (SEQ ID NO: 21)       FocF
VQINIRGNVY (SEQ ID NO: 22)       PapF, PrsF
PNLKJLFGTLL (SEQ ID NO: 23)      MrpF
VYINITGNVI (SEQ ID NO: 24)       MrpE
jKITFNGKVV (SEQ ID NO: 25)       F17A
GTINFNGKIT (SEQ ID NO: 26)       FanC
QKTIFSADVV (SEQ ID NO: 27)       FaeA
GQVNFFGKVT (SEQ ID NO: 28)       MrkA
QRTIITADVV (SEQ ID NO: 29)       RalC



In a preferred embodiment of the invention, the compounds are peptides or peptide
analogs that mimic the binding activity of the G1 beta-strand of a chaperone and that exhibit
antibacterial activity against a Gram-negative bacterium. The core sequence of such peptides
and peptide analogs may be derived from the G1 beta-strand of any of a number of
chaperones, including but not limited to, PapD, MrpD, FanE, SfaE, FaeE, MrkB, HifB, F17D,
FimC, FimB, PefD, EcpD, ClpE, YehC, PmfD, FocC, LpfB, SefB, Caf1M, CS3-1, PsaB,
MyfB, AggD, CssC, NfaA and AfaB. Typically, the core sequence is composed of about 3 to
about 12 residues, preferably from 4 to 9 residues and most preferably 7 residues. The core
sequence may correspond identically to the G1 beta-strand sequence of a chaperone, or it may
include one or more substitutions, preferably conservative substitutions, and/or insertions
and/or deletions.

Moreover, the core sequence may be flanked at either or both of its N- and
C-termini by residues of random sequence (i.e., sequences that do not necessarily correspond to
the G1 beta-strand from which the core sequence is derived). When included, such flanking
residues should not significantly alter the ability of the core sequence to mimic the binding
activity of the G1 beta-strand of a chaperone. Thus, typically the compounds of the invention
will include fewer than 5 flanking residues at each terminus, preferably fewer than 3 flanking
residues and most preferably no flanking residues.

Further, the peptides and/or peptide analogs may comprise hybrid sequences. For
example, the peptide or peptide analog may include a core sequence derived from the G1
beta-strand of a PapD chaperone flanked at one or both termini with sequences derived from an
MrpD chaperone. Alternatively, the peptide or peptide analog may include a core sequence
of, for example, 7 residues, some of which are, for example, derived from a PapD chaperone
and the rest of which are derived from, for example, a FanE chaperone.

In one illustrative embodiment, the compounds are 7 to 11 residue peptides and/or
peptide analogs comprising formula (II):

(II) X11-X12-X13-X14-X15-X16-X17

or a pharmaceutically-acceptable salt thereof, wherein:

X11 is any amino acid residue, preferably other than a basic residue;

X12 is any amino acid residue;

X13 is a hydrophobic residue, preferably an aliphatic residue or an apolar
residue, wherein the apolar residue is preferably M;

X14 is any amino acid residue, preferably other than an aromatic residue;

X15 is a hydrophobic residue, preferably an aliphatic residue;

X16 is any amino acid residue, preferably an aliphatic residue or a
hydroxyl-substituted aliphatic residue; and

X17 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue,
preferably an aliphatic residue, F, M or a hydroxyl-substituted aliphatic residue.

In the compounds comprising formula (II), the symbol "-" between residues Xn is as
previously defined for formula (I).

The various residues X11 through X17 may be selected from amongst the genetically
encoded amino acids, as well as from genetically non-encoded amino acids. Moreover, the
residues may be in either the D- or L-configuration, as long as the compound retains activity.
Compounds including D-amino acids may have enhanced in vivo stability. Preferably, all of
residues X11 through X17 are in the L-configuration.

The peptides and peptide analogs of formula (II) may optionally include, in addition
to the sequence defined by residues X11 through X17, a 1 to 5 residue peptide or peptide analog
at either or both termini. Peptide analogs typically contain at least one modified interlinkage,
such as a substituted amide or an isostere of an amide, as described above. Such additional
peptides or peptide analogs may have an amino acid sequence derived from the G1 beta-strand
of a chaperone or, alternatively, their sequences may be completely random. Compounds
including such random sequences may be tested for biological activity in the various assays
and methods described in a later section.

The residues which comprise such additional peptides or peptide analogs may be
genetically encoded or non-encoded, and may be in either the D- or L-configuration. In one
convenient embodiment, when the sequence defined by formula (II) is a peptide, one or both
termini are "capped" with 1 to 5 residue peptides composed wholly of D-amino acids that
serve to protect the core sequence from degradation in vivo by proteases and/or peptidases.

Also included within the scope of the present invention are "blocked" forms of the 
peptides and peptide analogs including formula (II), as previously described in connection 
with compounds comprising formula (I). 

Preferred amongst the 7 to 17 residue peptides and/or peptide analogs comprising
formula (II) are those compounds having one or more of the following characteristics:

X13 is an aliphatic residue or M;

X15 is an aliphatic residue, F or M; and/or

X17 is an aliphatic residue, F, M or T.



Particularly preferred are the 7-residue peptides described in Table 3, below. 
Table 3: CHAPERONE G1 BETA-STRAND-DERIVED PEPTIDES

AMINO ACID SEQUENCE              CHAPERONE

NVLQIAL (SEQ ID NO: 1)           PapD, MrpD
GSLSLAI (SEQ ID NO: 30)          FanE
NYLQFAI (SEQ ID NO: 31)          SfaE
SGIAVAL (SEQ ID NO: 32)          FaeE
NILQLAI (SEQ ID NO: 33)          MrkB
SFMQIAI (SEQ ID NO: 34)          HifB
NYLQFAV (SEQ ID NO: 35)          F17D
NTLQLAI (SEQ ID NO: 36)          FimC
GVLQLTI (SEQ ID NO: 37)          FimB
NVLAVAV (SEQ ID NO: 38)          PefD
SLLQLAF (SEQ ID NO: 39)          EcpD
SGIAVAV (SEQ ID NO: 40)          ClpE
NALKFAM (SEQ ID NO: 41)          YehC
NVLQMAM (SEQ ID NO: 42)          PmfD
NYLQFAI (SEQ ID NO: 43)          FocC
NVLQIAV (SEQ ID NO: 44)          LpfB
LNVNVVT (SEQ ID NO: 45)          SefB
VFVQFAI (SEQ ID NO: 46)          Caf1M
MKLNVSI (SEQ ID NO: 47)          CS3-1
MDIQMSI (SEQ ID NO: 48)          PsaB
LNILLSV (SEQ ID NO: 49)          MyfB
MNIQVSV (SEQ ID NO: 50)          AggD
DSINISI (SEQ ID NO: 51)          CssC
LNVQLSV (SEQ ID NO: 52)          NfaA, AfaB
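
The preferred characteristics recited for formula (II) lend themselves to a simple positional check. The Python sketch below is illustrative only: it assumes that "aliphatic residue" means A, V, L or I (the specification may define the class differently), and the name `preferred_characteristics` is hypothetical. It reports, for a 7-residue core sequence, which of the preferred characteristics for X13, X15 and X17 are met.

```python
# Assumed membership of the "aliphatic" class; an illustration, not a definition
# taken from the specification.
ALIPHATIC = frozenset("AVLI")

# Zero-based index within the 7-residue core, and the residues allowed there.
PREFERRED = {
    "X13": (2, ALIPHATIC | {"M"}),            # aliphatic or M
    "X15": (4, ALIPHATIC | {"F", "M"}),       # aliphatic, F or M
    "X17": (6, ALIPHATIC | {"F", "M", "T"}),  # aliphatic, F, M or T
}


def preferred_characteristics(core):
    """Map each preferred position of formula (II) to True or False for a core sequence."""
    if len(core) != 7:
        raise ValueError("a formula (II) core sequence has exactly seven residues")
    return {name: core[index] in allowed for name, (index, allowed) in PREFERRED.items()}


# Example: the PapD/MrpD-derived peptide listed in Table 3.
print(preferred_characteristics("NVLQIAL"))
```

A compound has "one or more" of the preferred characteristics when at least one value in the returned mapping is true.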



Deletions of residues from either terminus of the peptides and peptide analogs of
formula (I) or (II) are also contemplated to be within the scope of the invention. Such
deletions consist of the removal of one or more amino acids of the peptide sequence, with the
lower limit length of the resulting peptide sequence being 3 to 7 amino acids, preferably 3 to
5 amino acids. Such deletions may involve a single contiguous or greater than one discrete
portion of the peptide sequences. One or more such deletions may be introduced into the
sequence, as long as such deletions result in peptides which may still bind in whole, or in
part, to a pilus subunit and consequentially prevent or inhibit pilus biogenesis.
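
By way of illustration, the set of terminally deleted variants of a given peptide can be enumerated programmatically. The Python sketch below is a minimal example under stated assumptions: it considers only deletions from the N- and/or C-terminus (not internal deletions), it uses a lower length limit of 3 residues, and the function name `terminal_truncations` is hypothetical. Whether any given truncated peptide retains binding activity would have to be determined experimentally.

```python
def terminal_truncations(seq, min_len=3):
    """Return every shorter peptide obtained by deleting residues from either terminus."""
    variants = []
    n = len(seq)
    # Work from the longest truncation (one residue removed) down to min_len.
    for length in range(n - 1, min_len - 1, -1):
        for start in range(n - length + 1):
            variants.append(seq[start:start + length])
    return variants


# Example with the 7-residue PapD-derived sequence NVLQIAL.
for peptide in terminal_truncations("NVLQIAL"):
    print(peptide)
```

For a 7-residue peptide and a 3-residue lower limit this yields 14 candidate sequences, which could then be carried into the assays described in a later section.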

It will be appreciated that by virtue of the present invention, the above-described
polypeptides can be synthesized using conventional synthesis procedures commonly used by
one skilled in the art. For example, the polypeptides can be chemically synthesized using an
automated peptide synthesizer (such as one manufactured by Pharmacia LKB Biotechnology
Co., LKB Biolynk 4170 or Milligen, Model 9050 (Milligen, Milford, MA)) following the
method of Sheppard, et al., Journal of Chemical Society Perkin I, p. 538 (1981). In this
procedure, N,N'-dicyclohexylcarbodiimide is added to amino acids whose amine functional
groups are protected by 9-fluorenylmethoxycarbonyl (Fmoc) groups, and anhydrides of the
desired amino acids are produced. These Fmoc-amino acid anhydrides can then be used for
peptide synthesis. An Fmoc-amino acid anhydride corresponding to the C-terminal amino acid
residue is fixed to Ultrosyn A resin through the carboxyl group using dimethylaminopyridine
as a catalyst. Next, the resin is washed with dimethylformamide containing piperidine, and
the protecting group of the amino functional group of the C-terminal acid is removed. The
next amino acid corresponding to the desired peptide is coupled to the C-terminal amino acid.
The deprotecting process is then repeated. Successive desired amino acids are fixed in the
same manner until the peptide chain of the desired sequence is formed. The protective groups
other than the acetamidomethyl are then removed and the peptide is released with solvent.
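
The order of operations in the solid-phase procedure described above can be summarized with a short Python sketch. It is a schematic aid only, not a synthesizer protocol: the function name `fmoc_synthesis_steps` and the wording of each step are hypothetical, and reagent quantities, wash steps and reaction times are omitted. The sketch makes explicit that chain assembly proceeds from the C-terminal residue towards the N-terminus.

```python
def fmoc_synthesis_steps(peptide):
    """List resin operations for a peptide written from N-terminus to C-terminus."""
    if not peptide:
        raise ValueError("peptide sequence must not be empty")
    residues = list(peptide)[::-1]  # the C-terminal residue is attached first
    steps = [f"attach Fmoc-{residues[0]} to resin"]
    for residue in residues[1:]:
        steps.append("remove Fmoc (piperidine in DMF)")
        steps.append(f"couple Fmoc-{residue}")
    steps.append("remove protecting groups and release peptide from resin")
    return steps


# Example with the 7-residue sequence NVLQIAL: leucine is attached to the resin first.
for number, step in enumerate(fmoc_synthesis_steps("NVLQIAL"), start=1):
    print(number, step)
```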

Alternatively, the polypeptides can be synthesized by using nucleic acid molecules
which encode the peptides of this invention in an appropriate expression vector which includes
the encoding nucleotide sequences. Such DNA molecules may be readily prepared using an
automated DNA synthesizer and the well-known codon-amino acid relationship of the genetic
code. Such a DNA molecule also may be obtained as genomic DNA or as cDNA using
oligonucleotide probes and conventional hybridization methodologies. Such DNA molecules
may be incorporated into expression vectors, including plasmids, which are adapted for the
expression of the DNA and production of the polypeptide in a suitable host such as a
bacterium, e.g., Escherichia coli, a yeast cell or a mammalian cell.
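
The codon-amino acid relationship referred to above can be applied mechanically to obtain one possible coding sequence for a peptide. The Python sketch below is a hedged illustration: it builds the standard genetic code, then picks, for each amino acid, the first codon encountered in T, C, A, G order. That codon choice is arbitrary and is not optimized for any host, and the sketch does not add a start codon, stop codon or vector sequences. The function names `back_translate` and `translate` are hypothetical.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# One codon per amino acid: the first one met in the enumeration (arbitrary choice).
FIRST_CODON = {}
for codon, amino_acid in CODON_TABLE.items():
    FIRST_CODON.setdefault(amino_acid, codon)


def back_translate(peptide):
    """Return one DNA sequence that encodes the given peptide."""
    return "".join(FIRST_CODON[residue] for residue in peptide)


def translate(dna):
    """Translate a DNA coding sequence, three bases at a time."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


dna = back_translate("NVLQIAL")
print(dna)
print(translate(dna))
```

Translating the generated DNA back to protein recovers the starting peptide, which provides a simple consistency check on any designed coding sequence.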

It is known that certain modifications can be made without completely abolishing the
polypeptide's antibacterial activity. Modifications include the removal and addition of amino
acids. Polypeptides containing other modifications can be synthesized by one skilled in the
art and compounds comprising such polypeptides may be tested for biological activity in the
various assays and methods described in a later section. Thus, the effectiveness of the
polypeptides can be modulated through various changes in the amino acid sequence or
structure.

Further, it should be understood that the mimic may be modified using methods 
known in the art to improve binding, specificity, solubility, safety, or efficacy. A necessary 
characteristic of these preferred compounds is the capability to interact with at least one pilus 
subunit during transport of these pilus subunits through periplasmic space and/or during the 
process of assembly of the intact pilus, in such a manner that pilus biogenesis is prevented or 
inhibited. The compound can be any compound, preferably a peptide, which has one of the 
above effects on pilus subunits and thereby on the assembly of an intact pilus. 

Moreover, the present invention is directed to a compound which will mimic the
capability of mannose to bind to the mannose binding site at the tip of the FimH adhesin,
thereby preventing or inhibiting the ability of the pilus to adhere to and infect host tissues. As
discussed above, the bottom of this mannose-binding domain of FimH is lined with
asparagine, glutamine and aspartic acid residues, and those skilled in the art would be able to
use molecular modeling techniques and other existing protocols to design and synthesize
antibacterial compounds. Such compounds would compete with mannose for binding to the
FimH adhesin, thereby preventing or inhibiting pilus adhesion to host epithelium. As such,
these compounds may be used in methods of preventing or inhibiting pili adhesion to a host
tissue.

The present invention also provides a method for inhibiting bacterial colonization by a
Gram-negative organism. This method involves administration of a compound which will
interfere with the binding of a chaperone to a pilus subunit, thereby preventing the assembly
of an intact pilus structure. In a preferred embodiment of the invention, a method of
preventing or inhibiting the assembly of pilus subunits is provided by interfering with, in the
PapK pilus subunit, a binding site which is normally involved in the binding to pilus subunits
during transport of these pilus subunits through the periplasmic space and/or during the
process of pilus assembly. In another embodiment of the invention, a method of preventing
or inhibiting the assembly of pilus subunits is provided by interfering with, in the FimC pilus
subunit, a binding site which is normally involved in the binding to pilus subunits during
transport of these pilus subunits through the periplasmic space and/or during the process of
pilus assembly.

Antibacterial compounds and pharmaceutical compositions

In another preferred embodiment of the invention, a method of preventing or 
inhibiting the assembly of pilus subunits is provided by administering an antibacterial 
compound which will mimic the capability of a periplasmic chaperone or a pilus subunit to 
bind to a pilus subunit. Also provided is a method of preventing or inhibiting the adhesion of 
a pilus to a host tissue by administering an antibacterial compound which will bind to a pilus 
mannose-binding domain. 

The antibacterial compositions of the present invention may be utilized to inhibit pili
assembly and/or pili adhesion by providing an effective amount of such compositions to a
patient.

For use as antimicrobials for treatment of animal subjects, the compounds of the 
invention can be formulated as pharmaceutical or veterinary compositions. Depending on the 
subject to be treated, the mode of administration, and the type of treatment desired, e.g., 
prevention, prophylaxis, therapy, the compounds are formulated in ways consonant with
these parameters. A summary of such techniques is found in Remington's Pharmaceutical 
Sciences, latest edition, Mack Publishing Co., Easton, PA. 

For administration to animal or human subjects, the dosage of the compounds of the 
invention is typically 0.1-100 mg/kg. However, dosage levels are highly dependent on the
nature of the infection, the condition of the patient, the judgment of the practitioner, and the 
frequency and mode of administration. The dosage of such a substance is expected to be the 
dosage which is normally employed when administering antibacterial drugs to patients or 
animals, i.e., 1 µg-1000 µg per kilogram of body weight per day. The dosage will depend
partly on the route of administration of the substance. If the oral route is employed, the
absorption of the substance will be an important factor. A low absorption will have the effect 
that in the gastro-intestinal tract higher concentrations, and thus higher dosages, will be 
necessary. Also, the dosage of such a substance when treating infections of the central 
nervous system (CNS) will be dependent on the permeability of the blood-brain barrier for 
the substance. As is well-known in the treatment of bacterial meningitis with penicillin, very 
high dosages are necessary in order to obtain effective concentrations in the CNS. 

It will be understood that the appropriate dosage of the substance should suitably be 
assessed by performing animal model tests, wherein the effective dose level (e.g., ED50) and
the toxic dose level (e.g., TD50) as well as the lethal dose level (e.g., LD50 or LD10) are
established in suitable and acceptable animal models. Further, if a substance has proven
efficacious in such animal tests, controlled clinical trials should be performed. Needless to state,
such clinical trials should be performed according to the standards of Good Clinical Practice.

In general, for use in treatment, the compounds of the invention may be used alone or 
in combination with other antibiotics such as erythromycin, tetracycline, macrolides (for
example, azithromycin) and the cephalosporins. Depending on the mode of administration, the
compounds will be formulated into suitable compositions to permit facile delivery to the 
affected areas.

Formulations may be prepared in a manner suitable for systemic administration or
topical or local administration. Systemic formulations include those designed for injection 
(e.g., intramuscular, intravenous or subcutaneous injection) or may be prepared for 
transdermal, transmucosal, or oral administration. The formulation will generally include a 
diluent as well as, in some cases, adjuvants, buffers, preservatives and the like. 

For oral administration, the compounds can be administered also in liposomal 
compositions or as microemulsions. Suitable forms include syrups, capsules, and tablets, as is
understood in the art. For injection, formulations can be prepared in conventional forms as 
liquid solutions or suspensions or as solid forms suitable for solution or suspension in liquid 
prior to injection or as emulsions. Suitable excipients include, for example, water, saline, 
dextrose, glycerol and the like. Such compositions may also contain amounts of nontoxic 
auxiliary substances such as wetting or emulsifying agents, pH buffering agents and the like, 
such as, for example, sodium acetate, sorbitan monolaurate, and so forth.

It will be understood that the above-described methods comprising administration of
substances in treating and/or preventing diseases are dependent on the identification or de
novo design of substances which are capable of exerting effects which will lead to prevention
or inhibition of the interaction between pilus subunits and periplasmic molecular chaperones.
It is further important that these substances will have a high chance of being therapeutically
active.

Thus clinical experimental trials and animal studies can be undertaken to demonstrate 
the therapeutic efficacy of peptide mimics and analogues for preventing or inhibiting pilus 
assembly. The efficacy of such compounds can be shown using methods known in the art, 
including pilus inhibition, specifically ELISA or hemagglutination.

The antibacterial compositions of the present invention also have a variety of 
industrial uses, well known to those skilled in such arts, relating to their antibacterial 
properties. In general, these uses are carried out by bringing a biocidal or bacterial inhibitory 
amount of the antibacterial compositions of the present invention into contact with a surface, 
environment or biozone containing Gram-negative bacteria so that the composition is able to 
interact with and thereby interfere with the biological function of such bacteria. For example, 
such antibacterial compositions can be used to prevent or inhibit biofilm formation caused by 
Gram-negative bacteria and to inhibit bacterial colonization by a Gram-negative organism.
Compositions may be formulated as sprays, solutions, pellets, powders and in other forms of
administration well known to those skilled in such arts. 

Crystalline PapD-PapK Chaperone-Subunit Co-Complex and
FimH-FimC Chaperone-Adhesin Co-Complex

The present invention provides, for the first time, the high-resolution three-dimensional
structure and atomic structure coordinates of the crystalline co-complexes of the
PapD-PapK chaperone-subunit as determined by X-ray crystallography. Also provided for
usage in the methods of the present invention are the high-resolution three-dimensional
structures and atomic structure coordinates for the crystalline co-complexes of the FimC-
FimH chaperone-adhesin as determined by X-ray crystallography. The specific methods used
to obtain the structure coordinates are provided in the examples, infra. The atomic structure
coordinates of crystalline PapD-PapK co-complex, obtained from the co-crystal to 2.4 Å
resolution, are listed in Table 4. The atomic structure coordinates of crystalline FimC-FimH
co-complex, obtained from the co-crystal to 2.5 Å resolution, are listed in Table 5.

Additional antibacterial compounds can be modeled and synthesized utilizing the
atomic coordinates obtained from the resolution of the co-crystal structure of the PapD-PapK
chaperone-subunit co-complex and the FimC-FimH chaperone-adhesin co-complex. For
example, as discussed herein, applicants utilized the co-crystal structure of the FimC-FimH
chaperone-adhesin co-complex to identify the NH2-terminal mannose-binding domain of
FimH, an essential component required for pilus adhesion to host tissues. As the COOH-
terminus of pilus subunits in many tissue-adhering bacteria has been found to be highly
conserved, it is believed that the antibacterial compounds of the present invention are capable
of interacting with the majority of pilus subunits and thus are useful in the treatment of
various diseases caused by piliated bacteria.

Thus, the invention encompasses a co-crystal of a pilus chaperone-subunit co-
complex comprising an amino acid sequence of a G1 beta-strand of a periplasmic chaperone
and an amino acid sequence from the amino-terminal sequence of a pilus subunit. Preferably,
the amino acid sequence of a G1 beta-strand would be the N101 to L107 amino acid region of
a G1 beta-strand of a pilus chaperone, and even more preferably, the amino acid sequence of a
G1 beta-strand would be the N101 to L107 amino acid region of a G1 beta-strand of a PapD
chaperone, and most preferably, the amino acid sequence of the G1 beta-strand would be SEQ
ID NO: 1. Preferably, the amino acid sequence of the amino-terminal sequence would be
from the N-terminal sequence of a PapK subunit, and more preferably, the amino acid
sequence of the amino-terminal sequence would be the amino acid sequence of SEQ ID NO:
12. In a preferred embodiment, the co-crystal is a crystalline form of the polypeptides
corresponding to the PapD-PapK chaperone-subunit co-complex. In a preferred embodiment
of the invention, the co-crystal effectively diffracts X-rays for the determination of the atomic
coordinates of the pilus chaperone-subunit co-complex to a resolution of from about 3
angstroms to about 2.4 angstroms or greater.

Preferably, co-crystals of the invention comprise crystallized polypeptides 
corresponding to the wild-type PapD-PapK chaperone-subunit co-complex. The co-crystals 
of the invention include native co-crystals in which the crystallized PapD-PapK chaperone-
subunit co-complex is substantially pure and heavy-atom derivative co-crystals in which
the crystallized PapD-PapK chaperone-subunit co-complex is in association with one or more
heavy-metal atoms. The co-crystals from which the atomic structure coordinates of the
crystalline co-complexes of the present invention may be obtained include native co-crystals
and heavy-atom derivative co-crystals. Native co-crystals generally comprise substantially
pure polypeptides corresponding to the PapD-PapK co-complex in crystalline form. 

It is to be understood that the crystalline PapD-PapK co-complex from which the
atomic structure coordinates of the invention can be obtained is not limited to the wild-type
PapD-PapK co-complex. Indeed, the co-crystals may comprise mutants of the wild-type co-
complex. Mutants of wild-type co-complexes are obtained by replacing at least one amino
acid residue of one or both of the polypeptides comprising the wild-type co-
complex with a different amino acid residue, or by adding or deleting one or more amino acid
residues within the wild-type sequences and/or at the N- and/or C-terminus of one or both of
the polypeptides comprising the wild-type co-complex. Preferably, such mutants will
crystallize under crystallization conditions that are substantially similar to those used to
crystallize the wild-type co-complex.

The types of mutants contemplated by this invention include conservative mutants, 
non-conservative mutants, deletion mutants, truncated mutants, extended mutants, methionine 
mutants, selenomethionine mutants, cysteine mutants and selenocysteine mutants. A mutant 
may have, but need not have, pilus subunit binding activity. Preferably, a mutant displays
biological activity that is substantially similar to that of the wild-type polypeptide.
Methionine, selenomethionine, cysteine, and selenocysteine mutants are particularly useful for
producing heavy-atom derivative co-crystals, as described in detail below.

It will be recognized by one of skill in the art that the types of mutants contemplated 
herein are not mutually exclusive; that is, for example, a polypeptide having a conservative 
mutation in one amino acid may in addition have a truncation of residues at the N-terminus, 
and several Leu or Ile → Met mutations.

Sequence alignments of polypeptides in a protein family or of homologous 
polypeptide domains can be used to identify potential amino acid residues in the polypeptide 
sequence that are candidates for mutation. Identifying mutations that do not significantly 
interfere with the three-dimensional structure of the PapD-PapK co-complex and the FimC- 
FimH co-complex and/or that do not deleteriously affect, and that may even enhance, the 
activity of the co-complex will depend, in part, on the region where the mutation occurs. 

Conservative amino acid substitutions are well-known in the art, and include
substitutions made on the basis of a similarity in polarity, charge, solubility, hydrophobicity
and/or the hydrophilicity of the amino acid residues involved. Typical conservative
substitutions are those in which the amino acid is substituted with a different amino acid that
is a member of the same class or category, as those classes are defined herein. Thus, typical
conservative substitutions include aromatic to aromatic, apolar to apolar, aliphatic to
aliphatic, acidic to acidic, basic to basic, polar to polar, etc. Other conservative amino acid
substitutions are well known in the art. It will be recognized by those of skill in the art that
generally, a total of about 20% or fewer, typically about 10% or fewer, most usually about
5% or fewer, of the amino acids in the wild-type polypeptide sequence can be conservatively
substituted with other amino acids without deleteriously affecting the biological activity
and/or three-dimensional structure of the molecule, provided that such substitutions do not
involve residues that are critical for activity, as discussed above.
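
As an illustration of the percentages above, the following minimal sketch compares an aligned wild-type and mutant sequence and reports how many substitutions are conservative and what fraction of the sequence they represent. The residue classes used here are a common textbook grouping chosen purely for illustration; they are an assumption and are not the classes defined elsewhere in this specification. The two sequences and the function name are hypothetical.

```python
# Illustrative residue classes (an assumption; not the classes defined in this specification).
RESIDUE_CLASS = {}
for class_name, residues in {
    "aromatic": "FWY",
    "aliphatic": "AVLIM",
    "acidic": "DE",
    "basic": "KRH",
    "polar": "STNQ",
    "special": "CGP",
}.items():
    for residue in residues:
        RESIDUE_CLASS[residue] = class_name


def substitution_summary(wild_type: str, mutant: str) -> dict:
    """Count conservative and non-conservative substitutions in two aligned sequences."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must be aligned and of equal length")
    conservative = 0
    non_conservative = 0
    for wt, mt in zip(wild_type, mutant):
        if wt == mt:
            continue  # unchanged position
        if RESIDUE_CLASS[wt] == RESIDUE_CLASS[mt]:
            conservative += 1  # same class, e.g. aliphatic to aliphatic
        else:
            non_conservative += 1
    return {
        "conservative": conservative,
        "non_conservative": non_conservative,
        "fraction_conservative": conservative / len(wild_type),
    }


# Hypothetical 20-residue sequences differing at three positions:
# L->I and D->E stay within a class, K->D crosses from basic to acidic.
summary = substitution_summary("MKLDASTFWYGKLDERNQVA", "MKIDASTFWYGDLEERNQVA")
print(summary)
```

In this hypothetical case the conservatively substituted fraction can be compared directly against the 20%, 10% and 5% figures given above.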

The heavy-atom derivative co-crystals from which the atomic structure coordinates of 
the invention are obtained generally comprise a crystalline co-complex in association with 
one or more heavy metal atoms. The polypeptides may correspond to a wild-type or a mutant 
PapD-PapK co-complex or FimC-FimH co-complex, which may optionally be further 
associated with one or more molecules. There are two types of heavy-atom derivatives of
polypeptides: heavy-atom derivatives resulting from exposure of the proteins to a heavy metal
in solution, wherein co-crystals are grown in medium comprising the heavy metal, or in
crystalline form, wherein the heavy metal diffuses into the co-crystal, and heavy-atom
derivatives wherein at least one of the polypeptides in the co-complex comprises heavy-atom
containing amino acids, e.g., selenomethionine and/or selenocysteine mutants.

In practice, heavy-atom derivatives of the first type can be formed by soaking a native 
co-crystal in a solution comprising heavy metal atom salts, or organometallic compounds, 
e.g., lead chloride, gold thiomalate, thimerosal, uranyl acetate, platinum tetrachloride, 
osmium tetraoxide, zinc sulfate, and cobalt hexamine, which can diffuse through the co- 
crystal and bind to the crystalline polypeptides. 

Heavy-atom derivatives of this type can also be formed by adding to a crystallization
solution comprising the polypeptides to be co-crystallized an amount of a heavy metal atom
salt, which may associate with at least one of the proteins and be incorporated into the co-
crystal. The location(s) of the bound heavy metal atom(s) can be determined by X-ray
diffraction analysis of the co-crystal. This information, in turn, is used to generate the phase
information needed to construct the three-dimensional structure of the proteins in the co-
complex.

The native and/or heavy-atom derivative co-crystals from which the atomic structure 
coordinates of the invention are obtained can be obtained by conventional means as are well- 
known in the art of protein crystallography, including batch, liquid bridge, dialysis, and vapor 
diffusion methods (see, e.g., McPherson, 1982, Preparation and Analysis of Protein Crystals, 
John Wiley, New York; McPherson, 1990, Eur. J. Biochem. 189:1-23; Weber, 1991, Adv.
Protein Chem. 41:1-36). Generally, native co-crystals are grown by dissolving substantially
pure polypeptide encoding for the PapD-PapK co-complex or the FimH-FimC co-complex in 
an aqueous buffer containing a precipitant at a concentration just below that necessary to 
precipitate the protein. Examples of precipitants include, but are not limited to, polyethylene 
glycol, ammonium sulfate, 2-methyl-2,4-pentanediol, sodium citrate, sodium chloride, 
glycerol, isopropanol, lithium sulfate, sodium acetate, sodium formate, potassium sodium 
tartrate, ethanol, hexanediol, ethylene glycol, dioxane, t-butanol and combinations thereof. 
Water is removed by controlled evaporation to produce precipitating conditions, which are 
maintained until co-crystal growth ceases.

In a preferred embodiment, native co-crystals are grown by vapor diffusion in hanging
drops (McPherson, 1982, Preparation and Analysis of Protein Crystals, John Wiley, New
York; McPherson, 1990, Eur. J. Biochem. 189:1-23). In this method, the
polypeptide/precipitant solution is allowed to equilibrate in a closed container with a larger
aqueous reservoir having a precipitant concentration optimal for producing crystals.
Generally, less than about 25 µL of substantially pure polypeptide solution is mixed with an
equal volume of reservoir solution, giving a precipitant concentration about half that required 
for crystallization. This solution is suspended as a droplet underneath a coverslip, which is 
sealed onto the top of the reservoir. The sealed container is allowed to stand, usually for 
about 2-6 weeks, until co-crystals grow. 
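
The dilution step in the hanging-drop method can be sketched with a short calculation. This is a minimal illustration only: the function name, the 20% reservoir concentration and the 2 µL volumes are hypothetical, and the protein solution is assumed to contain no precipitant before mixing.

```python
def drop_precipitant_concentration(reservoir_conc, protein_volume_ul, reservoir_volume_ul):
    """Initial precipitant concentration in the drop after mixing.

    Assumes the protein solution itself contains no precipitant, so the
    precipitant is simply diluted by the added protein volume.
    """
    total_volume = protein_volume_ul + reservoir_volume_ul
    return reservoir_conc * reservoir_volume_ul / total_volume


# Hypothetical example: 2 uL of protein mixed with 2 uL of a 20% (w/v) precipitant reservoir.
initial = drop_precipitant_concentration(20.0, 2.0, 2.0)
print(f"initial drop concentration: {initial:.1f}%")
```

Mixing equal volumes halves the precipitant concentration, which is why the drop starts below the crystallization point and approaches it only as water vapor equilibrates with the reservoir.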

Heavy-atom derivative co-crystals can be obtained by soaking native co-crystals in 
mother liquor containing salts of heavy metal atoms. Further, heavy-atom derivative co-
crystals can also be obtained from SeMet and/or SeCys mutants, as described above for
native co-crystals.

Mutant proteins may crystallize under slightly different crystallization conditions than 
wild-type protein, or under very different crystallization conditions, depending on the nature 
of the mutation, and its location in the protein. For example, a non-conservative mutation
may result in alteration of the hydrophilicity of the mutant, which may in turn make the
mutant protein either more soluble or less soluble than the wild-type protein. Typically, if a
protein becomes more hydrophilic as a result of a mutation, it will be more soluble than the
wild-type protein in an aqueous solution and a higher precipitant concentration will be needed
to cause it to crystallize. Conversely, if a protein becomes less hydrophilic as a result of a
mutation, it will be less soluble in an aqueous solution and a lower precipitant concentration
will be needed to cause it to crystallize. If the mutation happens to be in a region of the
protein involved in crystal lattice contacts, crystallization conditions may be affected in more
unpredictable ways.

The dimensions of a unit cell of a crystal are defined by six numbers, the lengths of 
three unique edges, a, b, and c, and three unique angles, α, β and γ. The type of unit cell that
comprises a crystal is dependent on the values of these variables. In one embodiment, the co-
crystal of the PapD-PapK pilus chaperone-subunit co-complex has the space group of P2₁2₁2₁
with unit cell dimensions of a = 62.1 ± 0.2 angstroms, b = 63.6 ± 0.2 angstroms and c = 92.7
± 0.2 angstroms such that the three dimensional structure of the crystallized co-complex can
be determined to a resolution of from about 3 angstroms to about 2.4 angstroms or greater. In
another embodiment, the co-crystal of the FimC-FimH chaperone-adhesin co-complex has
the space group P4₁2₁2 or P4₃2₁2 with unit cell dimensions of a = b = 97.7 ± 0.2 angstroms and c =
215.9 ± 0.2 angstroms such that the three-dimensional structure of the co-complex can be 
determined to a resolution of from about 3 angstroms to about 2.5 angstroms or greater. 
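
The way the six cell parameters fix the size of the unit cell can be sketched as follows, using the general triclinic volume formula. The cell edges are the nominal values quoted above; the cell angles are not stated in the text, so the default of 90° for all three angles is an assumption, and the function name is hypothetical.

```python
import math


def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in cubic angstroms from edges (angstroms) and angles (degrees)."""
    cos_a, cos_b, cos_g = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # General (triclinic) expression; it reduces to a*b*c when all angles are 90 degrees.
    return a * b * c * math.sqrt(
        1.0 - cos_a**2 - cos_b**2 - cos_g**2 + 2.0 * cos_a * cos_b * cos_g
    )


# Nominal edges from the text; all angles assumed to be 90 degrees.
papd_papk_volume = unit_cell_volume(62.1, 63.6, 92.7)
fimc_fimh_volume = unit_cell_volume(97.7, 97.7, 215.9)
print(f"PapD-PapK cell volume: {papd_papk_volume:.0f} cubic angstroms")
print(f"FimC-FimH cell volume: {fimc_fimh_volume:.0f} cubic angstroms")
```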

When a crystal is placed in an X-ray beam, the incident X-rays interact with the 
electron cloud of the molecules that make up the crystal, resulting in X-ray scatter. The 
combination of X-ray scatter with the lattice of the crystal gives rise to nonuniformity of the 
scatter; areas of high intensity are called diffracted X-rays. The angle at which diffracted 
beams emerge from the crystal can be computed by treating diffraction as if it were reflection 
from sets of equivalent, parallel planes of atoms in a crystal (Bragg's Law). The most 
obvious sets of planes in a crystal lattice are those that are parallel to the faces of the unit cell.
These and other sets of planes can be drawn through the lattice points. Each set of planes is
identified by three indices, hkl. The h index gives the number of parts into which the a edge
of the unit cell is cut, the k index gives the number of parts into which the b edge of the unit
cell is cut, and the l index gives the number of parts into which the c edge of the unit cell is
cut by the set of hkl planes. Thus, for example, the 235 planes cut the a edge of each unit cell
into halves, the b edge of each unit cell into thirds, and the c edge of each unit cell into fifths.
Planes that are parallel to the bc face of the unit cell are the 100 planes; planes that are
parallel to the ac face of the unit cell are the 010 planes; and planes that are parallel to the ab 
face of the unit cell are the 001 planes. 
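
The relationship between the hkl indices, the spacing of the corresponding planes and the diffraction angle given by Bragg's Law (nλ = 2d sin θ) can be sketched as below. The spacing formula used is the standard one for a cell with all angles equal to 90°, which is an assumption here; the function names are hypothetical, and the wavelength of 1.5418 angstroms (copper Kα radiation) is chosen only as an illustration.

```python
import math


def d_spacing_orthogonal(h, k, l, a, b, c):
    """Interplanar spacing (angstroms) of the hkl planes for a cell with 90-degree angles."""
    inv_d_squared = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    if inv_d_squared == 0:
        raise ValueError("h, k and l cannot all be zero")
    return 1.0 / math.sqrt(inv_d_squared)


def bragg_angle_deg(d, wavelength, order=1):
    """Bragg angle theta in degrees from n * wavelength = 2 * d * sin(theta)."""
    sin_theta = order * wavelength / (2.0 * d)
    if sin_theta > 1.0:
        raise ValueError("reflection cannot be observed at this wavelength")
    return math.degrees(math.asin(sin_theta))


# The 235 planes of a cell with the PapD-PapK edge lengths quoted in the text.
d_235 = d_spacing_orthogonal(2, 3, 5, 62.1, 63.6, 92.7)
theta_235 = bragg_angle_deg(d_235, 1.5418)  # illustrative wavelength
print(f"d(235) = {d_235:.2f} angstroms, theta = {theta_235:.2f} degrees")
```

Higher indices cut the cell edges into more parts, so the planes lie closer together and diffract at larger angles.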

When a detector is placed in the path of the diffracted X-rays, in effect cutting into the 
sphere of diffraction, a series of spots, or reflections, are recorded to produce a "still" 
diffraction pattern. Each reflection is the result of X-rays reflecting off one set of parallel 
planes, and is characterized by an intensity, which is related to the distribution of molecules 
in the unit cell, and hkl indices, which correspond to the parallel planes from which the beam 
producing that spot was reflected. If the crystal is rotated about an axis perpendicular to the 
X-ray beam, a large number of reflections is recorded on the detector, resulting in a 
diffraction pattern. 

The unit cell dimensions and space group of a crystal can be determined from its
diffraction pattern. First, the spacing of reflections is inversely proportional to the lengths of
the edges of the unit cell. Therefore, if a diffraction pattern is recorded when the X-ray beam
is perpendicular to a face of the unit cell, two of the unit cell dimensions may be deduced
from the spacing of the reflections in the x and y directions of the detector, the crystal-to-
detector distance, and the wavelength of the X-ray. Those of skill in the art will appreciate
that, in order to obtain all three unit cell dimensions, the crystal must be rotated such that the
X-ray beam is perpendicular to another face of the unit cell. Second, the angles of a unit cell
can be determined by the angles between lines of spots on the diffraction pattern. Third, the
absence of certain reflections and the repetitive nature of the diffraction pattern, which may
be evident by visual inspection, indicate the internal symmetry, or space group, of the crystal.
Therefore, a crystal may be characterized by its unit cell and space group, as well as by its 
diffraction pattern. 

Once the dimensions of the unit cell are determined, the likely number of polypeptides
in the asymmetric unit can be deduced from the size of the polypeptide, the density of the
average protein, and the typical solvent content of a protein crystal, which is usually in the
range of 30-70% of the unit cell volume. 
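
One common way to carry out this estimate is the Matthews coefficient, sketched below. This is an illustration under stated assumptions rather than the procedure used by the applicants: the molecular weight of 40,000 Da and the value of four asymmetric units per cell are hypothetical placeholders, the function names are invented for this example, and the factor 1.23 reflects the usual assumption about the density of an average protein.

```python
def matthews_coefficient(cell_volume, asym_units_per_cell, copies, mol_weight_da):
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return cell_volume / (asym_units_per_cell * copies * mol_weight_da)


def solvent_fraction(vm):
    """Approximate solvent fraction for a protein of average density."""
    return 1.0 - 1.23 / vm


def plausible_copies(cell_volume, asym_units_per_cell, mol_weight_da,
                     low=0.30, high=0.70, max_copies=8):
    """Copy numbers per asymmetric unit whose solvent content falls in the typical range."""
    plausible = []
    for copies in range(1, max_copies + 1):
        vm = matthews_coefficient(cell_volume, asym_units_per_cell, copies, mol_weight_da)
        if low <= solvent_fraction(vm) <= high:
            plausible.append(copies)
    return plausible


# Cell volume from edges of 62.1, 63.6 and 92.7 angstroms (90-degree angles assumed).
# Four asymmetric units per cell and a 40,000 Da co-complex are hypothetical inputs.
cell_volume = 62.1 * 63.6 * 92.7
print(plausible_copies(cell_volume, 4, 40000.0))
```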

The diffraction pattern is related to the three-dimensional shape of the molecule by a
Fourier transform. The process of determining the solution is in essence a re-focusing of the
diffracted X-rays to produce a three-dimensional image of the molecule in the crystal. Since
re-focusing of X-rays cannot be done with a lens at this time, it is done via mathematical
operations. 

The sphere of diffraction has symmetry that depends on the internal symmetry of the 
crystal, which means that certain orientations of the crystal will produce the same set of 
reflections. Thus, a crystal with high symmetry has a more repetitive diffraction pattern, and
there are fewer unique reflections that need to be recorded in order to have a complete 
representation of the diffraction. The goal of data collection, a dataset, is a set of consistently 
measured, indexed intensities for as many reflections as possible. A complete dataset is 
collected if at least 80%, preferably at least 90%, most preferably at least 95% of unique 
reflections are recorded. In one embodiment, a complete dataset is collected using one 
crystal. In another embodiment, a complete dataset is collected using more than one crystal 
of the same type.

Sources of X-rays include, but are not limited to, a rotating anode X-ray generator
such as a Rigaku RU-200 or a beamline at a synchrotron light source, such as the Advanced 
Photon Source at Argonne National Laboratory. Suitable detectors for recording diffraction 
patterns include, but are not limited to, X-ray sensitive film, multiwire area detectors, image 
plates coated with phosphorus, and CCD cameras. Typically, the detector and the X-ray 
beam remain stationary, so that, in order to record diffraction from different parts of the 
crystal's sphere of diffraction, the crystal itself is moved via an automated system of 
moveable circles called a goniostat. 

One of the biggest problems in data collection, particularly from macromolecular 
crystals having a high solvent content, is the rapid degradation of the crystal in the X-ray 
beam. In order to slow the degradation, data is often collected from a crystal at liquid 
nitrogen temperatures. In order for a crystal to survive the initial exposure to liquid nitrogen, 
the formation of ice within the crystal must be prevented by the use of a cryoprotectant.
Suitable cryoprotectants include, but are not limited to, low molecular weight polyethylene
glycols, ethylene glycol, sucrose, glycerol, xylitol, and combinations thereof. Crystals may
be soaked in a solution comprising the one or more cryoprotectants prior to exposure to liquid
nitrogen, or the one or more cryoprotectants may be added to the crystallization solution.
Data collection at liquid nitrogen temperatures may allow the collection of an entire dataset 
from one crystal. 

Once a dataset is collected, the information is used to determine the three-dimensional 
structure of the molecule in the crystal. However, this cannot be done from a single
measurement of reflection intensities because certain information, known as phase
information, is lost between the three-dimensional shape of the molecule and its Fourier
transform, the diffraction pattern. This phase information must be acquired by methods 
described below in order to perform a Fourier transform on the diffraction pattern to obtain 
the three-dimensional structure of the molecule in the crystal. It is the determination of phase 
information that in effect refocuses X-rays to produce the image of the molecule. 
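
The loss of phase information can be illustrated with a one-dimensional toy model. This is a sketch only: the two "atoms", their positions and weights, the number of reflections and all function names are hypothetical, and a real structure determination works in three dimensions with measured amplitudes.

```python
import cmath
import math


def structure_factor(h, atoms):
    """Complex structure factor F(h) for point atoms given as (fractional x, weight)."""
    return sum(weight * cmath.exp(2j * math.pi * h * x) for x, weight in atoms)


def density(x, factors):
    """Fourier synthesis of the density at fractional coordinate x from F(h)."""
    return sum(f * cmath.exp(-2j * math.pi * h * x) for h, f in factors.items()).real


def peak_position(factors, n_grid=100):
    """Grid point in [0, 1) at which the synthesized density is highest."""
    grid = [i / n_grid for i in range(n_grid)]
    return max(grid, key=lambda x: density(x, factors))


# Hypothetical two-atom "molecule": a lighter atom at 0.20 and a heavier one at 0.55.
atoms = [(0.20, 6.0), (0.55, 8.0)]
with_phases = {h: structure_factor(h, atoms) for h in range(-10, 11)}
# A diffraction experiment records only amplitudes, so the phases are discarded here.
amplitudes_only = {h: complex(abs(f), 0.0) for h, f in with_phases.items()}

peak_true = peak_position(with_phases)
peak_no_phase = peak_position(amplitudes_only)
print(peak_true, peak_no_phase)
```

With the correct phases the highest density falls on the heavier atom; with the phases discarded the synthesis peaks at the origin instead, which is why phase information has to be recovered by the methods described below.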

One method of obtaining phase information is by isomorphous replacement, in which 
heavy-atom derivative crystals are used. In this method, the positions of heavy atoms bound 
to the molecules in the heavy-atom derivative crystal are determined, and this information is 
then used to obtain the phase information necessary to elucidate the three-dimensional
structure.

Another method of obtaining phase information is by molecular replacement, in which a
polypeptide whose structure is known is oriented and positioned within the unit cell of the
new crystal. Phases are then calculated from the oriented and positioned polypeptide and
combined with observed amplitudes to provide an approximate Fourier synthesis of the
structure of the molecules comprising the new crystal (Lattman, 1985, Methods in
Enzymology 115:55-77; Rossmann, 1972, "The Molecular Replacement Method," Int. Sci.
Rev. Ser. No. 13, Gordon & Breach, New York).

A third method of phase determination is multi-wavelength anomalous dispersion or
MAD. In this method, X-ray diffraction data are collected at several different wavelengths
from a single crystal containing at least one heavy atom with absorption edges near the
energy of incoming X-ray radiation. The resonance between X-rays and electron orbitals
leads to differences in X-ray scattering from which the heavy atoms can be located. A detailed
discussion of MAD analysis can be found in Hendrickson, 1985, Trans. Am. Crystallogr.
Assoc.; Hendrickson et al., 1990, EMBO J. 9:1665; and Hendrickson, 1991, Science.

A fourth method of determining phase information is single wavelength anomalous
dispersion or SAD. In this technique, X-ray diffraction data are collected at a single
wavelength from a single native or heavy-atom derivative crystal, and phase information is
extracted using anomalous scattering information from atoms in the native crystal or from
the heavy atoms in the heavy-atom derivative crystal. The wavelength of X-rays used to
collect data for this phasing technique need not be close to the absorption edge of the
anomalous scatterer. A detailed discussion of SAD analysis can be found in
Brodersen et al., 2000, Acta Cryst., D56:431-441.

A fifth method of determining phase information is single isomorphous replacement
with anomalous scattering or SIRAS. This technique combines isomorphous replacement
and anomalous scattering techniques to provide phase information for a crystal of a
polypeptide. X-ray diffraction data are collected at a single wavelength, usually from a single
heavy-atom derivative crystal. Phase information obtained only from the location of the
heavy atoms in a single heavy-atom derivative crystal leads to an ambiguity in the phase
angle, which is resolved using anomalous scattering from the heavy atoms. Phase
information is therefore extracted from both the location of the heavy atoms and from 
anomalous scattering of the heavy atoms. A detailed discussion of SIRAS analysis can be 
found in North, 1965, Acta Cryst. 18:212-216; Matthews, 1966, Acta Cryst. 20:82-86.

Once phase information is obtained, it is combined with the diffraction data to
produce an electron density map, an image of the electron clouds that surround the molecules
in the unit cell. The higher the resolution of the data, the more distinguishable are the
features of the electron density map, e.g., amino acid side chains and the positions of
carbonyl oxygen atoms in the peptide backbones, because atoms that are closer together are
resolvable. A model of the macromolecule is then built into the electron density map with the
aid of a computer, using as a guide all available information, such as the polypeptide
sequence and the established rules of molecular structure and stereochemistry. Interpreting
the electron density map is a process of finding the chemically realistic conformation that fits
the map precisely.

After a model is generated, a structure is refined. Refinement is the process of
minimizing the function Φ, which is the difference between observed and calculated intensity
values (measured by an R-factor), and which is a function of the position, temperature factor,
and occupancy of each non-hydrogen atom in the model. This usually involves alternate
cycles of real space refinement, i.e., calculation of electron density maps and model building,
and reciprocal space refinement, i.e., computational attempts to improve the agreement
between the original intensity data and intensity data generated from each successive model.
Refinement ends when the function Φ converges on a minimum wherein the model fits the
electron density map and is stereochemically and conformationally reasonable. During
refinement, ordered solvent molecules are added to the structure. 
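
The agreement measure referred to above can be sketched with the conventional crystallographic R-factor, which compares observed and calculated structure-factor amplitudes. The use of amplitudes rather than raw intensities follows the usual convention and is an assumption here, as are the function name and the three example values.

```python
def r_factor(f_observed, f_calculated):
    """Conventional R-factor: sum of |Fobs - Fcalc| divided by sum of Fobs.

    Both inputs are lists of structure-factor amplitudes for the same reflections,
    given in the same order.
    """
    if len(f_observed) != len(f_calculated):
        raise ValueError("observed and calculated lists must have the same length")
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_observed, f_calculated))
    denominator = sum(f_observed)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator


# Hypothetical amplitudes for three reflections.
r_value = r_factor([100.0, 50.0, 25.0], [90.0, 55.0, 25.0])
print(f"R = {r_value:.3f}")
```

A lower value indicates closer agreement between the model and the data, so successive refinement cycles are expected to reduce it until it stops improving.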

The atomic structure coordinates and machine readable media of the invention have a 
variety of uses. The present invention encompasses the structure coordinates and other 
information, e.g., amino acid sequence, connectivity tables, vector-based representations, 
temperature factors, etc., used to generate the three-dimensional structures of the polypeptides
for use in the software programs described below and other software programs. For example,
the coordinates are useful for solving the three-dimensional X-ray diffraction and/or solution
structures of other proteins, including mutant PapD-PapK chaperone-subunit or FimC-FimH
chaperone-adhesin co-complexes, PapD-PapK chaperone-subunit co-complexes or FimC-
FimH chaperone-adhesin co-complexes that are further associated with other molecules, and
unrelated proteins, to high resolution. Structural information may also be used in a variety of
molecular modeling and computer-based screening applications to, for example, intelligently
design mutants of the crystallized PapD-PapK chaperone-subunit co-complex or the
crystallized FimC-FimH chaperone-adhesin co-complex that have altered biological activity
and to computationally design and identify compounds that bind the G1 beta-strand of a
periplasmic chaperone or the amino-terminal end of a pilus subunit. Such compounds may be
used as lead compounds in pharmaceutical efforts to identify compounds that inhibit pilus
biogenesis as a therapeutic approach toward the treatment of several types of disease caused
by pathogenic Gram-negative bacteria such as Escherichia coli, Haemophilus influenzae,
Salmonella enteritidis, Salmonella typhimurium, Bordetella pertussis, Yersinia enterocolitica,
Yersinia pestis, Helicobacter pylori and Klebsiella pneumoniae.

In a further aspect of the invention, such potential antibacterial compounds are 
evaluated for their capacity to prevent or treat a bacterial infection. These methods comprise 
designing and synthesizing candidate antibacterial compounds using the atomic coordinates
of the three-dimensional structure of such co-crystals, and screening each compound for its
ability to bind to pilus subunits, thereby inhibiting or preventing pilus biogenesis. The antibacterial activity of
the compound is determined by assaying the bacterium for infectivity or monitoring the pilus 
for activity. Such compounds which are able to prevent or inhibit pilus biogenesis or the 
ability of the bacterial pilus to infect a host tissue can be used in the pharmaceutical 
compositions of the present invention. 

Additionally, the invention encompasses machine readable media embedded with the 
three-dimensional structures of the models described herein, or with portions thereof. As 
used herein, "machine readable medium" refers to any medium that can be read and accessed 
directly by a computer or scanner. Such media include, but are not limited to: magnetic 
storage media, such as floppy discs, hard disc storage medium and magnetic tape; optical 
storage media such as optical discs or CD-ROM; electrical storage media such as RAM or 
ROM; and hybrids of these categories such as magnetic/optical storage media. Such media
further include paper on which is recorded a representation of the atomic structure
coordinates, e.g., Cartesian coordinates, that can be read by a scanning device and converted
into a three-dimensional structure using optical character recognition (OCR).

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon the atomic structure coordinates of the
invention or portions thereof and/or X-ray diffraction data. The choice of the data storage
structure will generally be based on the means chosen to access the stored information. In
addition, a variety of data processor programs and formats can be used to store the sequence
and X-ray data information on a computer readable medium. Such formats include, but are
not limited to, Protein Data Bank ("PDB") format (Research Collaboratory for Structural
Bioinformatics; http://www.rcsb.org/pdb/docs/format/pdbguide2.2/guide2.2_fr);
Cambridge Crystallographic Data Centre format
(http://www.ccdc.cam.ac.uk/support/csd_doc/volume3/z323.html); Structure-data ("SD") file
format (MDL Information Systems, Inc.; Dalby et al., 1992, J. Chem. Inf. Comp. Sci.
32:244-255), and line-notation, e.g., as used in SMILES (Weininger, 1988, J. Chem. Inf. Comp. Sci.
28:31-36). Methods of converting between various formats read by different computer
software will be readily apparent to those of skill in the art, e.g., BABEL (v. 1.06, Walters &
Stahl, ©1992, 1993, 1994; http://www.brunel.ac.uk/departments/chem/babel). All
format representations of the polypeptide coordinates described herein, or portions thereof,
are contemplated by the present invention. By providing a computer readable medium having
stored thereon the atomic coordinates of the invention, one of skill in the art can routinely 
access the atomic coordinates of the invention, or portions thereof, and related information for 
use in modeling and design programs, described in detail below. 
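
To make the idea of accessing stored coordinates concrete, the sketch below reads fixed-column ATOM/HETATM records of the PDB format named above. It is an illustrative reader only: the function names and the dictionary keys are hypothetical, and it handles just the coordinate fields rather than the full format.

```python
def parse_atom_record(line):
    """Parse one fixed-column PDB ATOM/HETATM record into a dict.

    Column positions follow the PDB format's ATOM record layout;
    the dict keys are illustrative names, not part of the format.
    """
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError("not an ATOM/HETATM record: %r" % record)
    return {
        "serial": int(line[6:11]),        # atom serial number
        "name": line[12:16].strip(),      # atom name, e.g. CA
        "res_name": line[17:20].strip(),  # residue name, e.g. PHE
        "chain": line[21],                # chain identifier
        "res_seq": int(line[22:26]),      # residue sequence number
        "x": float(line[30:38]),          # orthogonal coordinates in Angstroms
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),   # temperature factor
    }


def read_atoms(lines):
    """Return parsed records for every ATOM/HETATM line, skipping other records."""
    return [parse_atom_record(l) for l in lines if l.startswith(("ATOM", "HETATM"))]
```

The resulting list of records carries the position, occupancy, temperature factor and chain identifier of each atom, which is the information the modeling and design steps described below consume.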

While Cartesian coordinates are important and convenient representations of the 
three-dimensional structure of a polypeptide, those of skill in the art will readily recognize 
that other representations of the structure are also useful. Therefore, the three-dimensional 
structure of a polypeptide, as discussed herein, includes not only the Cartesian coordinate 
representation, but also all alternative representations of the three-dimensional distribution of 
atoms. For example, atomic coordinates may be represented as a Z-matrix, wherein a first
atom of the protein is chosen, a second atom is placed at a defined distance from the first
atom, and a third atom is placed at a defined distance from the second atom so that it makes a
defined angle with the first atom. Each subsequent atom is placed at a defined distance from
a previously placed atom with a specified angle with respect to the third atom, and at a
specified torsion angle with respect to a fourth atom. Atomic coordinates may also be
represented as a Patterson function, wherein all interatomic vectors are drawn and are then
placed with their tails at the origin. This representation is particularly useful for locating
heavy atoms in a unit cell. In addition, atomic coordinates may be represented as a series of
vectors having magnitude and direction and drawn from a chosen origin to each atom in the
polypeptide structure. Furthermore, the positions of atoms in a three-dimensional structure
may be represented as fractions of the unit cell (fractional coordinates), or in spherical polar
coordinates.
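
The Z-matrix construction described above can be sketched as a single placement step that converts one internal-coordinate entry (distance, angle, torsion) into Cartesian coordinates. This is a minimal illustration using a standard reference-frame construction; the function name `place_atom` and its argument order are assumptions, not part of any program cited in this document.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _unit(u):
    length = math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)
    if length == 0.0:
        raise ValueError("reference atoms are coincident or collinear")
    return (u[0] / length, u[1] / length, u[2] / length)


def place_atom(a, b, c, bond, angle_deg, torsion_deg):
    """Place a new atom d from three previously placed atoms a, b, c.

    d lies at distance `bond` from c, makes the angle b-c-d equal to
    `angle_deg`, and the torsion a-b-c-d equal to `torsion_deg`.
    """
    angle = math.radians(angle_deg)
    torsion = math.radians(torsion_deg)
    bc = _unit(_sub(c, b))                 # axis along the b->c bond
    n = _unit(_cross(_sub(b, a), bc))      # normal to the a-b-c plane
    m = _cross(n, bc)                      # completes the local frame
    # Coordinates of d in the local (bc, m, n) frame centered on c.
    dx = -bond * math.cos(angle)
    dy = bond * math.sin(angle) * math.cos(torsion)
    dz = bond * math.sin(angle) * math.sin(torsion)
    return tuple(c[i] + dx * bc[i] + dy * m[i] + dz * n[i] for i in range(3))
```

Applying this step repeatedly, one row of the Z-matrix at a time, rebuilds the full set of Cartesian coordinates, which is why the two representations carry the same structural information.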

Additional information, such as thermal parameters, which measure the motion of 
each atom in the structure, chain identifiers, which identify the particular chain of a multi- 
chain protein or protein co-complex in which an atom is located, and connectivity 
information, which indicates to which atoms a particular atom is bonded, is also useful for 
representing a three-dimensional molecular structure. 

Uses of the Atomic Structure Coordinates

Structure information, typically in the form of the atomic structure coordinates, can be 
used in a variety of computational or computer-based methods to, for example, design, screen 
for and/or identify compounds that bind the crystallized polypeptide or a portion or fragment 
thereof, or to intelligently design mutants that have altered biological properties. 

In one embodiment, the co-crystals and structure coordinates obtained therefrom are 
useful for identifying and/or designing compounds that bind PapD, PapK, FimC or FimH as 
an approach towards developing new therapeutic agents. For example, a high resolution 
X-ray structure will often show the locations of ordered solvent molecules around the protein, 
and in particular at or near putative binding sites on the protein. This information can then be 
used to design molecules that bind these sites, the compounds synthesized and tested for 
binding in biological assays. Travis, 1993, Science 262:1374. 

In another embodiment, the structures are probed with a plurality of molecules to 
determine their ability to bind to PapD, PapK, FimC or FimH at various sites. Such
compounds can be used as targets or leads in medicinal chemistry efforts to identify, for
example, inhibitors of potential therapeutic importance.

In specific embodiments described herein, the high resolution X-ray structures of the 
PapD/PapK and FimC/FimH co-complexes show details of the interactions between PapD 
and PapK, and between FimC and FimH, respectively. This information can be used to 
design molecules that bind to the sites of interaction, thereby blocking co-complex formation.
In addition, the X-ray structure of the FimC/FimH co-complex has a C-HEGA molecule
bound in the mannose-binding pocket of FimH, which can be used to model compounds that
bind to the lectin and inhibit the FimH interaction with mannose oligosaccharides on host 
cells. 

In yet another embodiment, the structures can be used to computationally screen small
molecule databases for chemical entities or compounds that can bind in whole, or in part, to
PapD, PapK, FimC or FimH. In this screening, the quality of fit of such entities or
compounds to the binding site may be judged either by shape complementarity or by
estimated interaction energy. Meng et al., 1992, J. Comp. Chem. 13:505-524.
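
As a rough illustration of scoring a fit by estimated interaction energy, the sketch below sums a Lennard-Jones 12-6 term and a Coulomb term over all ligand-protein atom pairs. It is a toy scoring function under stated assumptions: a single well depth `epsilon` and contact distance `r_min` are used for every atom pair and a uniform dielectric constant is assumed, whereas the docking programs cited in this document use their own atom-typed parameters.

```python
import math

# Coulomb conversion factor in kcal*Angstrom/(mol*e^2).
COULOMB_K = 332.0637


def interaction_energy(ligand, protein, epsilon=0.1, r_min=3.5, dielectric=4.0):
    """Estimate a non-bonded interaction energy in kcal/mol.

    ligand and protein are sequences of (x, y, z, partial_charge) tuples.
    epsilon, r_min and dielectric are illustrative, hypothetical defaults.
    """
    total = 0.0
    for lx, ly, lz, lq in ligand:
        for px, py, pz, pq in protein:
            r = math.dist((lx, ly, lz), (px, py, pz))
            if r == 0.0:
                raise ValueError("overlapping atoms")
            ratio6 = (r_min / r) ** 6
            van_der_waals = epsilon * (ratio6 ** 2 - 2.0 * ratio6)
            electrostatic = COULOMB_K * lq * pq / (dielectric * r)
            total += van_der_waals + electrostatic
    return total
```

More negative totals indicate a more favorable estimated interaction; candidate poses or compounds can be ranked by this value before any synthesis is attempted.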

The design of compounds that bind to PapD, PapK, FimC or FimH according to this
invention generally involves consideration of two factors. First, the compound must be
capable of physically and structurally associating with PapD, PapK, FimC or FimH. This
association can be covalent or non-covalent. For example, covalent interactions may be
important for designing suicide or irreversible inhibitors of a protein. Non-covalent
molecular interactions important in the association of PapD with PapK or of FimC with FimH
include hydrogen-bonding, ionic interactions and van der Waals and hydrophobic
interactions. Second, the compound must be able to assume a conformation that allows it to
associate with PapD, PapK, FimC or FimH. Although certain portions of the compound will
not directly participate in this association with the protein, those portions may still influence
the overall conformation of the molecule. This, in turn, may have a significant impact on
potency. Such conformational requirements include the overall three-dimensional structure
and orientation of the chemical group or compound in relation to all or a portion of the
binding site, or the spacing between functional groups of a compound comprising several
chemical groups that directly interact with the protein.

The potential inhibitory or binding effect of a chemical compound on PapD, PapK,
FimC or FimH may be analyzed prior to its actual synthesis and testing by the use of
computer modeling techniques. If the theoretical structure of the given compound suggests
insufficient interaction and association between it and the protein, synthesis and testing of the
compound is unnecessary. However, if computer modeling indicates a strong interaction, the 
molecule may then be synthesized and tested for its ability to bind to the protein and inhibit 
its activity. In this manner, synthesis of ineffective compounds may be avoided. 

An inhibitory or other binding compound of PapD, PapK, FimC or FimH may be 
computationally evaluated and designed by means of a series of steps in which chemical 
groups or fragments are screened and selected for their ability to associate with the individual 
binding pockets or interface surfaces of each of the proteins. One skilled in the art may use 
one of several methods to screen chemical groups or fragments for their ability to associate 
with PapD, PapK, FimC or FimH. This process may begin by visual inspection of, for 
example, the protein/protein interfaces or the mannose-binding site of FimH on the computer 
screen based on the PapD/PapK or FimC/FimH co-complex coordinates. Selected fragments 
or chemical groups may then be positioned in a variety of orientations, or docked, at an 
individual surface of PapD, PapK, FimC or FimH that participates in a protein/protein
interface in the co-complex, or in the mannose-binding pocket of FimH, as defined supra. 
Docking may be accomplished using software such as QUANTA and SYBYL, followed by 
energy minimization and molecular dynamics with standard molecular mechanics forcefields, 
such as CHARMM and AMBER. 

Specialized computer programs may also assist in the process of selecting fragments 
or chemical groups. These include:

1. GRID (Goodford, 1985, J. Med. Chem. 28:849-857). GRID is available from
Oxford University, Oxford, UK;

2. MCSS (Miranker & Karplus, 1991, Proteins: Structure, Function and Genetics
11:29-34). MCSS is available from Molecular Simulations, Burlington, MA;

3. AUTODOCK (Goodsell & Olson, 1990, Proteins: Structure, Function, and
Genetics 8:195-202). AUTODOCK is available from Scripps Research Institute, La Jolla,
CA; and

4. DOCK (Kuntz et al., 1982, J. Mol. Biol. 161:269-288). DOCK is available
from University of California, San Francisco, CA.

Once suitable chemical groups or fragments have been selected, they can be
assembled into a single compound or inhibitor. Assembly may proceed by visual inspection 
of the relationship of the fragments to each other in the three-dimensional image displayed on 
a computer screen in relation to the structure coordinates of PapD, PapK, FimC or FimH. 
This would be followed by manual model building using software such as QUANTA or 
SYBYL. 

Useful programs to aid one of skill in the art in connecting the individual chemical
groups or fragments include:

1. CAVEAT (Bartlett et al., 1989, 'CAVEAT: A Program to Facilitate the
Structure-Derived Design of Biologically Active Molecules'. In 'Molecular Recognition in
Chemical and Biological Problems', Special Pub., Royal Chem. Soc. 78:182-196). CAVEAT
is available from the University of California, Berkeley, CA;

2. 3D Database systems such as MACCS-3D (MDL Information Systems, San
Leandro, Calif.). This area is reviewed in Martin, 1992, J. Med. Chem. 35:2145-2154; and

3. HOOK (available from Molecular Simulations, Burlington, Mass.).

Instead of proceeding to build an inhibitor of PapD/PapK or FimC/FimH co-complex
formation, or of mannose-binding to FimH, in a step-wise fashion one fragment or chemical
group at a time, as described above, PapD-, PapK-, FimC- or FimH-binding compounds may
be designed as a whole or 'de novo' using either an empty binding site or the surface of a
protein that participates in protein/protein interactions in a co-complex, or optionally
including some portion(s) of a known inhibitor(s) or of the second protein in the co-complex.
These methods include:

1. LUDI (Bohm, 1992, J. Comp. Aid. Molec. Design 6:61-78). LUDI is available
from Molecular Simulations, Inc., San Diego, CA;

2. LEGEND (Nishibata & Itai, 1991, Tetrahedron 47:8985). LEGEND is
available from Molecular Simulations, Burlington, Mass.; and

3. LeapFrog (available from Tripos, Inc., St. Louis, Mo.). 

Other molecular modeling techniques may also be employed in accordance with this 
invention. See, e.g., Cohen et al., 1990, J. Med. Chem. 33:883-894. See also, Navia &
Murcko, 1992, Current Opinions in Structural Biology 2:202-210.

Once a compound has been designed or selected by the above methods, the efficiency
with which that compound may bind to PapD, PapK, FimC or FimH may be tested and
optimized by computational evaluation. For example, a compound that has been designed or
selected to function as a FimH mannose-binding inhibitor must also preferably occupy a
volume not overlapping the volume occupied by the mannose-binding site residues when
mannose is bound. An effective inhibitor of PapD/PapK or FimC/FimH co-complex
formation must preferably demonstrate a relatively small
difference in energy between its bound and free states (i.e., it must have a small deformation
energy of binding). Thus, the most efficient inhibitors should preferably be designed with a
deformation energy of binding of not greater than about 10 kcal/mol, preferably, not greater
than 7 kcal/mol. Inhibitors may interact with the protein in more than one conformation that
is similar in overall binding energy. In those cases, the deformation energy of binding is
taken to be the difference between the energy of the free compound and the average energy of
the conformations observed when the inhibitor binds to the protein.
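
The deformation energy criterion above reduces to simple arithmetic once conformational energies are available. The sketch below is an illustration only: it assumes the energies (in kcal/mol) have already been computed by some external program, takes the sign convention of bound average minus free, and uses hypothetical function names.

```python
def deformation_energy(free_energy, bound_energies):
    """Deformation energy of binding in kcal/mol.

    Computed as the average energy of the bound conformations minus the
    energy of the free compound (sign convention is an assumption).
    """
    if not bound_energies:
        raise ValueError("at least one bound conformation is required")
    average_bound = sum(bound_energies) / len(bound_energies)
    return average_bound - free_energy


def meets_limit(energy, limit=10.0):
    """True if the deformation energy does not exceed the limit.

    The default of 10 kcal/mol follows the text; 7 kcal/mol is the
    stricter, preferred limit.
    """
    return energy <= limit
```

For example, a compound with a free-state energy of -50 kcal/mol and two bound conformations at -45 and -43 kcal/mol has a deformation energy of 6 kcal/mol, which satisfies both the 10 kcal/mol and the 7 kcal/mol limits.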

A compound designed or selected for binding to PapD, PapK, FimC or FimH may be
further computationally optimized so that in its bound state it would preferably lack repulsive
electrostatic interaction with the target protein. Such non-complementary electrostatic
interactions include repulsive charge-charge, dipole-dipole and charge-dipole interactions.
Specifically, the sum of all electrostatic interactions between the inhibitor and the protein
when the inhibitor is bound to it preferably make a neutral or favorable contribution to the
enthalpy of binding.

Specific computer software is available in the art to evaluate compound deformation
energy and electrostatic interaction. Examples of programs designed for such uses include:
Gaussian 92, revision C (Frisch, Gaussian, Inc., Pittsburgh, PA, ©1992); AMBER, version
4.0 (Kollman, University of California at San Francisco, ©1994); QUANTA/CHARMM
(Molecular Simulations, Inc., Burlington, MA, ©1994); and Insight II/Discover (Biosym
Technologies Inc., San Diego, CA, ©1994). These programs may be implemented, for
instance, using a computer workstation, as are well-known in the art. Other hardware systems
and software packages will be known to those skilled in the art.

Once a PapD-, PapK-, FimC- or FimH-binding compound has been optimally selected
or designed, as described above, substitutions may then be made in some of its atoms or
chemical groups in order to improve or modify its binding properties. Generally, initial
substitutions are conservative, i.e., the replacement group will have approximately the same
shape, hydrophobicity and charge as the original group. One of skill in the art will
understand that substitutions known in the art to alter conformation should be avoided. Such
altered chemical compounds may then be analyzed for efficiency of binding to PapD, PapK,
FimC or FimH by the same computer methods described in detail above.

Because PapD/PapK co-complexes may crystallize in more than one crystal form, the
structure coordinates of PapD/PapK co-complex, of PapD alone, of PapK alone, or of
portions thereof, are particularly useful to solve the structure of other co-crystal forms
of PapD/PapK co-complex. They may also be used to solve the structure of mutants, of
PapD/PapK co-complex further complexed to another molecule, or of the crystalline form of
any other protein or protein co-complex with significant amino acid sequence homology to
any functional domain of PapD or PapK. Similarly, the structure coordinates of FimC/FimH
co-complex, of FimC alone, of FimH alone, or of portions thereof, are particularly useful to
solve the structure of other co-crystal forms of FimC/FimH co-complex. They may also be
used to solve the structure of mutants, of FimC/FimH co-complex further complexed to
another molecule, or of the crystalline form of any other protein or protein co-complex with
significant amino acid sequence homology to any functional domain of FimC or FimH.

One method that may be employed for this purpose is molecular replacement. In this
method, the unknown co-crystal structure, whether it is another co-crystal form of a
PapD/PapK or FimC/FimH co-complex, a mutant, a PapD/PapK or FimC/FimH complex
that is further complexed to another molecule, or the crystal of some other protein or protein
co-complex with significant amino acid sequence homology to any functional domain of one
of the proteins in the co-complex crystal, may be determined using phase information from
the PapD/PapK or FimC/FimH structure coordinates, respectively. This method will provide
an accurate three-dimensional structure for the unknown protein or protein co-complex in the
new crystal more quickly and efficiently than attempting to determine such information ab
initio.

If an unknown crystal form has the same space group as and similar cell dimensions
to the known co-complex crystal form, then the phases derived from the known crystal form
can be directly applied to the unknown crystal form, and in turn, an electron density map for
the unknown crystal form can be calculated. Difference electron density maps can then be
used to examine the differences between the unknown crystal form and the known crystal
form. A difference electron density map is a subtraction of one electron density map, e.g., that
derived from the known crystal form, from another electron density map, e.g., that
derived from the unknown crystal form. Therefore, all similar features of the two electron
density maps are eliminated in the subtraction and only the differences between the two
structures remain. For example, if the unknown crystal form is of a FimC/FimH co-complex
that is further complexed to a mannose analog in the FimH mannose binding site, then a
difference electron density map between this map and the map derived from the native
uncomplexed crystal will ideally show only the electron density of the differences between
C-HEGA and the mannose analog. Similarly, if amino acid side chains have different
conformations in the two crystal forms, then those differences will be highlighted by peaks
(positive electron density) and valleys (negative electron density) in the difference electron
density map, making the differences between the two crystal forms easy to detect. However,
if the space groups and/or cell dimensions of the two crystal forms are different, then this
approach will not work and molecular replacement must be used in order to derive phases for
the unknown crystal form.
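
The subtraction that defines a difference electron density map can be sketched on two maps sampled on the same grid. The code below is a minimal illustration assuming both maps are given as nested lists indexed [i][j][k] with identical dimensions; the function names and the single symmetric threshold are hypothetical choices, not the behavior of any program cited in this document.

```python
def difference_map(map_unknown, map_known):
    """Subtract the known-form map from the unknown-form map, point by point."""
    if len(map_unknown) != len(map_known):
        raise ValueError("maps must be sampled on the same grid")
    result = []
    for plane_u, plane_k in zip(map_unknown, map_known):
        if len(plane_u) != len(plane_k):
            raise ValueError("maps must be sampled on the same grid")
        plane = []
        for row_u, row_k in zip(plane_u, plane_k):
            if len(row_u) != len(row_k):
                raise ValueError("maps must be sampled on the same grid")
            plane.append([u - k for u, k in zip(row_u, row_k)])
        result.append(plane)
    return result


def find_features(diff, threshold):
    """Return grid indices of peaks (>= threshold) and valleys (<= -threshold)."""
    peaks, valleys = [], []
    for i, plane in enumerate(diff):
        for j, row in enumerate(plane):
            for k, value in enumerate(row):
                if value >= threshold:
                    peaks.append((i, j, k))
                elif value <= -threshold:
                    valleys.append((i, j, k))
    return peaks, valleys
```

Grid points where the two maps agree cancel to near zero, so only bound ligands or shifted side chains survive as peaks and valleys.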

All of the complexes referred to above may be studied using well-known X-ray
diffraction techniques and may be refined versus 1.5 Å or higher to 3 Å resolution X-ray data
to an R value of about 0.20 or less using computer software, such as X-PLOR (Yale
University, ©1992, distributed by Molecular Simulations, Inc.). See, e.g., Blundell et al.,
1976, Protein Crystallography, Academic Press; Methods in Enzymology, vols. 114 & 115,
Wyckoff et al., eds., Academic Press, 1985. This information may thus be used to optimize
known classes of inhibitors of PapD/PapK or FimC/FimH co-complex formation or of
mannose binding to FimH, and more importantly, to design and synthesize novel classes of
inhibitors of PapD/PapK or FimC/FimH co-complex formation or of mannose binding to
FimH.

The structure coordinates of PapD/PapK or FimC/FimH mutant co-complexes will
also facilitate the identification of related protein co-complexes analogous to the PapD/PapK
or FimC/FimH co-complexes in function, structure or both, thereby further leading to novel
therapeutic modes for treating or preventing Gram-negative bacteria-mediated diseases.

Subsets of the atomic structure coordinates can be used in any of the above methods.
Particularly useful subsets of the coordinates include, but are not limited to, coordinates of
single domains, coordinates of residues lining an active site, and coordinates of residues that
participate in protein-protein contacts at an interface. For
example, the coordinates of one domain of a protein that contains the active site may be used
to design inhibitors that bind to that site, even though the protein is fully described by a larger
set of atomic coordinates. Therefore, as described in detail for the specific embodiments
below, a set of atomic coordinates that define the entire polypeptide chain, although useful for
many applications, do not necessarily need to be used for the methods described herein.
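
Extracting a subset of coordinates, such as the residues of one binding pocket on one chain, amounts to filtering atom records by chain identifier and residue number. The sketch below assumes atoms are held as dictionaries with hypothetical keys `chain` and `res_seq`; the residue numbers a user passes in would come from the structure being studied and are not supplied here.

```python
def select_subset(atoms, chain, residue_numbers):
    """Return the atoms on `chain` whose residue number is in `residue_numbers`.

    atoms is a sequence of dicts with at least the keys "chain" and
    "res_seq" (an illustrative, assumed layout).
    """
    wanted = set(residue_numbers)
    return [atom for atom in atoms
            if atom["chain"] == chain and atom["res_seq"] in wanted]
```

The filtered list can then be written out or passed to a docking or design program in place of the full coordinate set.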

Uses of Subsets of Atomic Coordinates

The structure coordinates of the present invention, and subsets thereof, are useful for
designing or screening for compounds that bind to the PapD, PapK, FimC or FimH proteins.
The high resolution X-ray structures of the PapD/PapK and FimC/FimH co-complexes of the
present invention show details of the interactions between PapD and PapK, and between
FimC and FimH, respectively. This information can be used to design and/or screen for
compounds that bind to the sites of interaction, thereby blocking co-complex formation and
pilus assembly. In addition, the X-ray structure of the FimC/FimH co-complex has a C-HEGA
molecule bound in the mannose-binding pocket of FimH, which can be used to model
compounds that bind to the lectin domain and inhibit the FimH interaction with mannose on
host cells.

Those of skill in the art will recognize that the complete set of PapD/PapK
complex structure coordinates and the complete set of FimC/FimH complex structure
coordinates will be useful in the methods of the present invention. Those of skill in the art
will further recognize that the coordinates of PapD, PapK, FimC and FimH will be useful
separate from the coordinates of the protein with which each protein forms a co-complex in
the crystals. In addition, those of skill in the art will recognize that subsets of the structure
coordinates of each protein, such as the coordinates of a single domain or interface or binding
pocket, will be useful in the methods of the invention, as discussed in more detail below.

In one embodiment, the PapK coordinates, or the subset of PapK coordinates that are
the residues in the hydrophobic groove region of PapK (the K1 region), where the G1
beta-strand of PapD interacts with PapK in the co-complex crystal structure, are useful for
designing and/or screening for compounds that bind to the K1 region and inhibit pilus
assembly. A subset of structure coordinates of PapK useful in this embodiment of the
invention include those of Val« Leu« Val- Phe - Phe - Vie™

In another embodiment, the PapD coordinates, or the subset of PapD coordinates that
are the G1 beta-strand residues (the D1 region), which interact with the K1 region and fit
into the hydrophobic groove of PapK in the PapD/PapK co-complex structure, are useful for
designing compounds that have an analogous shape, such that the compounds fit into the
PapK groove and inhibit pilus assembly. A subset of G1 beta-strand structure coordinates of
PapD useful in this embodiment include those of Leu"" 0 . Gta"»°, Ile'«*> Ala"*°and Leu"™

In yet another embodiment, the PapD coordinates, or a subset of PapD coordinates in
the D2 region, and the PapK coordinates, or a subset of PapK coordinates in the K2 region,
which participate in a second interface of the PapD/PapK co-complex, are useful for
designing and/or screening for compounds that disrupt this interaction and prevent PapD-
PapK co-complex formation. A subset of PapK coordinates useful for this embodiment of the
invention include those of residues Val»«, Gl/«, Lys- and Arg"». A subset of PapD
coordinates useful for this embodiment of the invention include those of residues Thr'»°
Ile-Glu-Glu-TV-Ile-andArg-

In another embodiment, the FimH coordinates, a subset of the FimH coordinates that
are the pilin domain of FimH, or a subset of FimH coordinates that are the residues in the
hydrophobic groove region of the pilin domain, where the G1 beta-strand of FimC interacts
with FimH, are useful for designing and/or screening for compounds that inhibit this
interaction, thereby inhibiting pilus formation in type 1 pili. A subset of FimH structure
coordinates useful in this embodiment of the invention include those of residues Ah*.
Attn"", Val'»» and Val"*, as numbered in Fig. 8.

In yet another embodiment, the FimC coordinates, or a subset of FimC coordinates
that are the residues of the G1 beta-strand that interact with the hydrophobic groove region of
FimH, are useful for designing compounds that have an analogous shape, such that the
compounds fit into the FimH groove and inhibit type 1 pilus assembly. A subset of FimC
structure coordinates useful in this embodiment of the invention include those of residues
Ile103C, Leu105C and Ile107C.

In another embodiment, the FimH coordinates, a subset of FimH coordinates that are 
the lectin domain of FimH, or a subset of FimH coordinates that comprise the mannose 
binding pocket of the lectin domain are useful for designing and/or screening for compounds 
that fit into the mannose binding pocket and block the interaction of FimH with host cell 
mannose oligosaccharides, thus preventing adhesion to host cells and E. coli pathogenesis. A 
subset of structure coordinates useful in this embodiment of the invention include those of 
residues Phe1H, Asn46H, Asp47H, Tyr48H, Ile52H, Asp54H, Gln133H, Asn135H, Tyr137H,
Asn138H, Asp140H and Phe142H.



The following examples illustrate the invention, but are not to be taken as limiting the 
various aspects of the invention so illustrated. 



EXAMPLES 

Example 1: The PapD-PapK Chaperone-Subunit Co-Complex
Expression of the PapD-PapK Co-Complex. The PapD-PapK co-complex was
overexpressed in E. coli and periplasms were prepared as described by Slonim et al. (EMBO J.
1992, 11:4747). Periplasms were then subjected to cation exchange (15S Source
(Pharmacia)) followed by hydrophobic interaction (15PHE Source (Pharmacia))
chromatography to yield pure co-complex. Selenomethionine (Se-Met) PapD-PapK
co-complexes were expressed as
described by Hendrickson et al. (EMBO J. 1990, 9:1665) and purified as was the wild-type
co-complex. The purified wild-type or Se-Met PapD-PapK co-complexes were dialyzed
against 20 mM KMES pH 6.7 and concentrated to ~12 mg/ml. Co-crystals were grown by
vapor diffusion using the hanging drop method against a reservoir containing 10-15% (w/v)
PEG 6000, 100 mM potassium acetate, and 200-400 mM sodium acetate at pH 4.6 [A.
McPherson, Eur. J. Biochem. 189, 23 (1990)] and appeared within three to five days. The
co-crystals were cryoprotected by increasing the concentration of PEG 6000 to 25% (w/v)
and flash-cooled to liquid nitrogen temperature. Co-crystals were in space group P2₁2₁2₁,
with cell dimensions a = 62.12 ± 0.2 Å, b = 63.69 ± 0.2 Å, and c = 92.72 ± 0.2 Å, and with
one co-complex in the asymmetric unit. Table 4 contains a summary of the data collection and
refinement statistics.
A complete data set to a resolution of 2.7 Å was collected in the laboratory setting
(Rigaku Raxis IV image plate mounted on a Rigaku RU200 rotating anode X-ray generator)
using an oscillation range of 1.5° and exposure time of 45 min/frame ("Native" data set in
Table 4). Se-Met PapD-PapK co-crystals were in the same space group with the same cell
dimensions. Once cooled, these co-crystals diffracted to slightly higher resolution in the
laboratory setting and a complete data set ("Se-Met Single" in Table 4) to a resolution of 2.5
Å was collected (2.5° oscillation range, 60 min/frame). These co-crystals were also used to
collect MAD data at the National Synchrotron Light Source at Brookhaven National
Laboratory (Beamline X4A). Complete data sets at four wavelengths to a resolution of 2.4 Å
were collected ("Se-Met1-4" in Table 4). All data were reduced and processed using the
programs DENZO and SCALEPACK [Z. Otwinowski, in Proceedings of the CCP4 Study
Weekend, L. Sawyer, N. Isaacs, S. Bailey, Eds. (SERC Daresbury Laboratory,
Warrington, 1993), pp. 56-62].
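
To relate the cell dimensions reported above to the quoted resolution limits, the following sketch computes the unit cell volume and the interplanar spacing d for a reflection (h, k, l). It assumes an orthorhombic cell (all angles 90°, as the reported space group implies), for which 1/d² = h²/a² + k²/b² + l²/c²; the function names are illustrative only.

```python
import math


def cell_volume(a, b, c):
    """Volume of an orthorhombic unit cell in cubic Angstroms."""
    return a * b * c


def d_spacing(h, k, l, a, b, c):
    """Interplanar spacing (Angstroms) of reflection (h, k, l) in an orthorhombic cell."""
    if (h, k, l) == (0, 0, 0):
        raise ValueError("(0, 0, 0) is not a reflection")
    return 1.0 / math.sqrt((h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2)
```

A data set "to a resolution of 2.7 Å" contains reflections whose d spacing computed this way is 2.7 Å or larger; extending the limit to 2.4 Å adds reflections with higher indices and therefore finer detail.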

StnictureofPapD-PapK co^complex. The structure of the PapD-PapK co-complex 
was solved using MAD phasing [W. A- Hendrickson, Science 254, 51 (199.1)]. The PapD- 
PapK co-complex contains three methionines, all of which are in PapD, at positions 18, 66, 
and 172. The "Native" and "Se-Met Single" data sets were first used to generate a difference 
Patterson map using the program HEAVY [T, C. Terwilliger and D. Eisenberg, Acta 
Crystalldgr. A39, 813 (1983)] where strong peaks could be readily located.. Three heavy 
metal positions were ^ 

Eisenberg, Acta Crystallogr. A43, 1 (1987)]. Initial SIRAS-solvent flattened phases were,
however, insufficient to build a model of PapK. Subsequently, multi-wavelength anomalous
diffraction (MAD) data were collected (Table 4). After local scaling using the high energy
remote wavelength ("SeMet-4" in Table 4) as the reference wavelength, MAD phases were
calculated using SHARP [E. De La Fortelle and G. Bricogne, Methods Enzymol. 276, 472
(1997)]. An interpretable electron density map was readily obtained after density
modification by solvent flipping (program SOLOMON [J. P. Abrahams and A. G. W. Leslie,
Acta Crystallogr. D52, 32 (1996)]). The PapD subunit was rebuilt into the experimental
electron density, starting from the apo-PapD structure. A Cα trace of the PapK subunit was
built into the experimental electron density map using program O [T. A. Jones and S. Thirup,
EMBO J. 5, 819 (1986); T. A. Jones, J. Y. Zou, S. W. Cowan, M. Kjeldgaard, Acta
Crystallogr. A47, 110 (1991)], accounting for all but 8 residues located at the NH2-terminus
for which, even at later stages of the refinement, no electron density was observed. The
electron density was of sufficient quality (Fig. 1) to unequivocally assign the sequence. The
model was then refined using CNSsolve 0.5 [A. T. Brunger et al., Acta Crystallogr. D54, 905
(1998)] against the "SeMet-3" structure factor amplitudes using the maximum likelihood
refinement target with incorporation of experimental phase information [P. D. Adams, N. S.
Pannu, R. J. Read, A. T. Brunger, Proc. Natl. Acad. Sci. 94, 5018 (1997); N. S. Pannu, G. N.
Murshudov, E. J. Dodson, R. J. Read, Acta Crystallogr. D54, 1258 (1998)]. Both positional
and simulated annealing refinement in Cartesian space were used (the temperature factors
were set to 25 Å²) and resulted in values of R and free-R of 27.4 and 32.5%, respectively [A.
T. Brunger, J. Mol. Biol. 203, 803 (1988)]. After two rounds of rebuilding, where simulated
annealing omit maps were generated for ambiguous regions and used to adjust the model [A.
Hodel, S.-H. Kim, A. T. Brunger, Acta Crystallogr. A48, 851 (1992)], positional refinement
followed by restrained refinement of the temperature factors resulted in a model with R and
free-R values of 24.3 and 28.8%, respectively. At this stage, 104 well-defined water
molecules were added, resulting in a final model with R and free-R values of 23.8% and
27.4%, respectively. The stereochemistry of the model is excellent and the temperature
factors restrained appropriately (Table 4). The model of PapK is complete between residues
9 and 157. Electron density was poor for residues 216 to 218 of PapD and therefore this
region was not included in the final model. Also, for the same reason, residues Arg^ and
Glu" in PapD were built as alanines. All residues in PapK and PapD are located in either the
most favored or the allowed regions of the Ramachandran plot [G. N. Ramachandran and V.
Sasisekharan, Adv. Protein Chem. 23, 283 (1968)]. Coordinates have been deposited at the
Protein Data Bank (entry code 1PDK).
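
The R and free-R values quoted in this section follow the conventional crystallographic residual, in which the summed absolute differences between observed and calculated structure factor amplitudes are divided by the summed observed amplitudes, and free-R is the same quantity over reflections held out of refinement. The Python sketch below illustrates only that arithmetic. It is not the CNS implementation; the function names are hypothetical, the amplitudes are made-up numbers, and the calculated amplitudes are assumed to be already on the same scale as the observed ones.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by a boolean 'free' flag and score each subset."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in test], [c for _, c in test])
    return r_work, r_free


# Made-up amplitudes, for illustration only (not data from Table 4).
F_OBS = [100.0, 50.0, 25.0, 80.0]
F_CALC = [90.0, 55.0, 25.0, 60.0]
FREE_FLAGS = [False, False, False, True]  # last reflection is held out

R_WORK, R_FREE = r_work_and_free(F_OBS, F_CALC, FREE_FLAGS)
print(f"R = {R_WORK:.3f}, free-R = {R_FREE:.3f}")
```

A free-R that sits a few percentage points above R, as in the values reported here, is the usual pattern, since the held-out reflections are not fitted during refinement.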

COOH-terminally truncated Ig fold of PapK. PapK has the same overall variable-
region immunoglobulin-like (Ig) fold as the amino-terminal domain of PapD, with two
beta-sheets coming together in a beta-sandwich (Figs. 2A and 3A; see also Fig. 2A for secondary
structure notation). However, the Ig fold of PapK is incomplete: it lacks the COOH-terminal
seventh strand, G, which in canonical Ig folds forms an antiparallel beta-sheet interaction
with strand F and contributes to the hydrophobic core of the protein. Remarkably, in the
PapD-PapK co-complex, this missing strand is provided by PapD, which donates its G1
beta-strand to complete the Ig fold of PapK (Figs. 2A, 2B, and 3A). The Ig fold thus produced is,
however, atypical, since the donated strand runs parallel, rather than antiparallel, to strand F in
PapK. The insertion of the G1 beta-strand into the fold of the pilin, termed "donor strand
complementation", has important implications for the mechanisms of subunit folding, capping
and assembly.

The first eight NH2-terminal residues of PapK are disordered. The Ig fold of PapK
(Fig. 3A) begins with a short beta-strand, A1, which makes typical antiparallel hydrogen
bonds with the COOH-terminal residues of strand B. This short beta-sheet arrangement is
interrupted by the insertion of a 3₁₀ helical turn (Figs. 2A and 3B) which results in strand A
switching sides in the beta-sandwich in order to make antiparallel beta-strand interactions
with the G1 beta-strand of the chaperone (Fig. 3A). Strands A and B are connected by a short
α-helix (αB in Figs. 2A and 3B) which precedes three successive aromatic residues (Phe35,
Trp36, Tyr37, Fig. 3B). While Phe35 inserts into the hydrophobic core of the beta-sandwich,
Trp36 and Tyr37 interact closely with residues at the COOH-terminus of helix D (Fig. 2A),
possibly contributing to its stability. Strand B forms the edge of one of the two beta-sheets in
the beta-sandwich and runs antiparallel to strand E. Following strand B, the structure crosses
over to the other side of the beta-sandwich through a short 3₁₀ helix (Fig. 2A) to form strand
C1, which runs antiparallel to strand F. The COOH-terminus of strand C1 deviates from the
beta-sheet arrangement to form a protruding beta-meander (strands C' and C''). Strand C
reaches over to the other side of the beta-sandwich to form main-chain hydrogen bonds with
strand D1. This small beta-structure eventually returns, as C2, to make main-chain hydrogen
bonding interactions with strand F (Figs. 2A, 3A, and 3B).

An extended loop links strand C to strand D1 on the other side of the beta-sandwich.
Strand D constitutes an edge of the D, E, B, A1 beta-sheet. It therefore runs antiparallel to
strand E. However, strand D is divided in the middle by an insertion which meanders
towards the C', C'' meander and reaches back to the E strand. Strand E is followed by a
three-turn helix (αD) and a long loop structure which connects it to the COOH-terminal strand
F. Finally, strand F, from Asp145 onward, forms a parallel beta-sheet with strand G1 of PapD
(Figs. 2A and 3A). Hence, strand G1 of PapD is an integral part of the C, F, A2 beta-sheet of
PapK.

Structure of PapD in the PapD-PapK Co-complex. Except for the F1-G1 loop in the
NH2-terminal domain (Figs. 3C and 4), the structure of PapD in the PapD-PapK co-complex
superimposes very well with apo PapD (r.m.s. deviation in Cα atom positions, excluding the
F1-G1 loop, of 0.65 Å). Hence, the binding of PapK does not alter the orientation of the
domains of PapD. The major difference between the apo and PapK-bound forms of PapD is a
large conformational change in the F1-G1 loop of PapD. The tip of this loop undergoes a flap
motion of about 11 Å that results in a re-ordering of the F1-G1 loop such that residues 101 to
105 of PapD become part of the G1 beta-strand.
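
The r.m.s. deviation quoted above is a root-mean-square distance over matched Cα atoms, computed with one loop left out. As a minimal sketch of that calculation, assuming the two structures have already been superimposed and that Cα coordinates are held in dictionaries keyed by residue number (a hypothetical data layout, with made-up coordinates), the excluded residues can simply be dropped from the set being compared:

```python
import math


def ca_rmsd(coords_a, coords_b, exclude=frozenset()):
    """RMS distance over shared residues, skipping any in `exclude`.

    coords_a / coords_b map residue number -> (x, y, z) in angstroms and
    are assumed to be in the same frame (already superimposed).
    """
    shared = sorted((set(coords_a) & set(coords_b)) - set(exclude))
    if not shared:
        raise ValueError("no residues left to compare")
    squared = 0.0
    for residue in shared:
        squared += sum((p - q) ** 2
                       for p, q in zip(coords_a[residue], coords_b[residue]))
    return math.sqrt(squared / len(shared))


# Made-up coordinates: residues 1 and 2 agree closely, residue 3 stands in
# for a loop that moves a long way between the two forms.
APO = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (5.0, 5.0, 5.0)}
BOUND = {1: (0.0, 0.0, 1.0), 2: (1.0, 0.0, 1.0), 3: (0.0, 0.0, 0.0)}

print(ca_rmsd(APO, BOUND))               # dominated by the mobile residue
print(ca_rmsd(APO, BOUND, exclude={3}))  # mobile residue left out
```

Leaving out a region known to move keeps one large local change from masking how well the rest of the two structures agree.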

The PapD-PapK interface. The total buried surface area in the PapD-PapK
co-complex is 3434 Å². There are two distinct sites on PapK that interact with two
corresponding sites on PapD. Site K1 of PapK interacts with a site on the NH2-terminal
domain (domain 1) of PapD (site D1) and site K2 of PapK interacts with a site on the
COOH-terminal domain (domain 2) of PapD (site D2) (Fig. 5).

Site K1 contains a deep groove which runs the length of the subunit. The edges of the
groove consist of strands A and F and its base is formed by the hydrophobic core of PapK
(Figs. 6A, 6B and 6E). This groove is the result of the missing G beta-strand in the Ig fold of
PapK. Site D1 includes residues 101 to 112 of the G1 beta-strand of PapD, which insert into
the K1 groove and make a beta-zipper interaction with strand F of PapK on one side of the
groove. Residues 101 to 105 also make a beta-zipper interaction with strand A2 on the other
side of the groove (Figs. 6A and 6B). Insertion of the G1 beta-strand also results in the
formation of a continuous 5-stranded beta-sheet which includes strands C1, F1, and G1 of
PapD and F and C1 of PapK (Fig. 2A). The alternating hydrophobic residues in the G1
beta-strand of PapD (Leu103, Ile105, and Leu107) interact with the hydrophobic base of the groove
(Fig. 6E). Thus the donor strand complementation by the G1 beta-strand of PapD shields the
hydrophobic core of the pilin from exposure to the aqueous milieu of the periplasm.

The K1-D1 interaction also involves contacts at the end of the groove nearest the cleft
of the chaperone. These interactions consist of hydrophobic and polar contacts between the
A1 strand of PapK and the A1, A2, and C1 strands of PapD (Figs. 6A and 6B). The
COOH-terminal carboxylate of PapK anchors the subunit into the cleft of PapD by hydrogen bonding
to the invariant Arg8 and Lys112 residues of PapD as well as to the Oγ hydroxyl of highly
conserved Thr152 (Figs. 6C and 6D).

Site K2 is formed primarily by residues in helix 3₁₀C and the COOH-terminal Arg157
side chain of PapK (Figs. 6C and 6D). This interface is less extensive than site K1 (455 Å²).
Residues in site K2 interact with residues in the C2 and D2 strands and with the F2-G2 loop of
domain 2 of PapD (Site D2). The K2-D2 interface includes hydrogen bonds between Thr57 of
PapK and the main-chain carbonyls of Glu" 4 and Glu165 of PapD, as well as polar and
hydrophobic contacts involving Lys61 and Ile62 of PapK and Arg200 and Ile154 of PapD.

Example 2: Preparation and comparison of FimA subunits
from different strains of E. coli

Genomic DNA was prepared from overnight broth cultures of 59 uropathogenic E. coli
strains using the Puregene DNA Isolation Kit (Minneapolis, MN). DNA was amplified
by PCR using Taq polymerase (Perkin Elmer) with the following primers:
5'-GATGGGTGGGAGAGGAAGGAGG-3' (SEQ ID NO: 53) and
5'-GTTGGTATGACCCGCATCAATCGC-3' (SEQ ID NO: 54), which flank the fimA locus,
under the following conditions: cycle 1 (95°C for 1 min), cycles 2-30 (95°C for 30 sec, 50°C
for 30 sec, 72°C for 2 min) in the presence of 3.0 mM MgCl2. The fimA amplified
fragments were purified with a QIAquick Purification Kit (Qiagen, Germany), sequenced

_4k5P±iyjyi* 

Elmer, Norwalk, CT) and analyzed on the ABI 373 Automated DNA Sequencer (PE Applied
Biosystems, Foster City, CA). The fimA sequences were aligned and compared using the
Lasergene software program (DNAStar).
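
The comparison step in this example was carried out in Lasergene. As a stand-in illustration of what comparing aligned alleles involves, the Python sketch below scans an alignment column by column and reports the positions at which the sequences differ. The function name and the toy sequences are hypothetical and are not fimA data; the input is assumed to be already aligned, so all sequences have the same length.

```python
def variable_positions(aligned_sequences):
    """Return 1-based alignment columns where not all sequences agree."""
    lengths = {len(seq) for seq in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    variable = []
    for index, column in enumerate(zip(*aligned_sequences), start=1):
        if len(set(column)) > 1:  # more than one distinct character
            variable.append(index)
    return variable


# Toy alignment, for illustration only.
TOY_ALIGNMENT = [
    "ACGT",
    "ACGA",
    "ATGT",
]
print(variable_positions(TOY_ALIGNMENT))
```

Mapping the variable columns back onto a structural model is what allows statements about where allelic variation falls on the pilus surface.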

Example 3: Structure of FimH in the FimH-FimC Co-Crystal

FimH is folded into two domains of the all-beta class. The NH2-terminal mannose-
binding domain comprises residues 1H - 156H, and the COOH-terminal pilin domain, which
is used to anchor the adhesin to the pilus, comprises residues 160H - 279H. A short extended
linker (residues 157H - 159H) connects the two domains. FimC in the co-complex has the
same overall structure as free FimC. The pilin domain of FimH binds in the cleft of the
chaperone, but mostly to the chaperone's NH2-terminal domain.

The lectin domain of FimH is an 11-stranded elongated beta-barrel with a jelly roll-
like topology (Figure 8B). A pocket capable of accommodating a mono-mannose unit is
located at the tip of the domain, distal from the connection to the pilin domain (Figure 9B).
The bottom of the pocket is lined with asparagine, glutamine and aspartic acid residues in
three loop regions, which are typical carbohydrate binding side chains (Figure 10A). A
molecule of cyclohexylbutanoyl-N-hydroxyethyl-D-glucamide (C-HEGA) is bound in this
pocket. C-HEGA is not a known inhibitor of FimH mannose binding but was needed in the
crystallization to produce useful co-crystals of FimC-FimH co-complex. The glucamide
moiety of C-HEGA is blocked at C1 and cannot form a pyranose, but is bent to approach the
pyranose conformation. The C2, C3, C4 and C6 hydroxyl groups of C-HEGA are enclosed
within the pocket, whereas the C5 hydroxyl and cyclohexylbutanoyl-N-hydroxyethyl groups
point out from the pocket and are solvent exposed. Residues Asp54H, Gln133H, Asn135H, Asp140H
and the NH2-terminal amino group of FimH (Figure 10A) are hydrogen bonded to the
glucamide moiety of C-HEGA. FimH from a urinary tract E. coli isolate which has a lysine
instead of asparagine at position 135H produces type 1 pili but is unable to mediate mannose
sensitive hemagglutination of guinea pig erythrocytes (S. Langermann, unpublished results).
Also, a mutation at residue 136H has been reported to completely block mannose binding.
See Schembri et al., FEMS Microbiol. Lett., 137, 257 (1996).

The pilin domain of FimH has the same immunoglobulin-like topology as the NH2-
terminal domain of FimC, except that the seventh strand of the fold is missing. Two
antiparallel beta-sheets (strands A'BED' and D''CF) pack against each other to form a beta-barrel
that is similar to, but distinct from, immunoglobulin barrels. As in the chaperones, strand
switching occurs at the edges of the sheets. In the chaperones, the A1 strand of the NH2-
terminal domain switches between the two sheets of the barrel. The first strand of the pilin
domain exhibits a similar switch, but due to the lack of a seventh strand, the second half of
the A strand is not involved in main chain hydrogen bonding within the domain. The D
strand of the chaperones as well as of the FimH pilin domain also switches, but in the pilin
domain the switch is an 8-residue loop instead of the cis-proline bulge found in the
chaperones. The C-D loop and the D'-D'' connection pack against each other and close the
[...] Due to the absence of a seventh strand, a scar is created on [...]. Residues that would
be part of the hydrophobic [...] domain instead form a deep hydrophobic crevice on the
surface of the pilin domain.
Examples Fimr-FimHr-^ .,,-! ^ ,,,.,^. .... 

2 oZ^ U ' i0 : ! 4 ^ ° f FimC " FtaH CO - COmPleX * ' 300 
T^cTf " " a * mk S °"" i0n "* ' M ~ » «** in 0.1 M 

T <™** ~ - were arranged 

as two sets of four molecules related by approximate 4₁ screw axes. Electron density was
excellent for one set of molecules (Figure 9A), allowing applicants to trace the entire
co-complex. For the second set of molecules, electron density was poorer but allowed for
unambiguous placement of a copy of the initially traced co-complex.

Hendrickson, Science 254, 51 (1991)) data on BM14 of the ESRF. Data were recorded at
each of 3 wavelengths corresponding to the peak of the Se white line, the point of inflexion of
the K absorption edge, and a remote wavelength, using a MAR CCD detector. Data were
reduced using the program HKL2000 (Z. Otwinowski and W. Minor, in Methods in
Enzymology, C. W. Carter, R. M. Sweet, Eds. (Academic Press, New York, 1997), vol. 276,
pp. 307), with further processing and scaling using the CCP4 processing package (CCP4,
Acta Cryst. D50, 760 (1994)).

The co-crystals [...] space group C2 with
cell dimensions a = 139.03 ± 0.2 Å, b = 139.03 ± 0.2 Å, c = 214.49 ± 0.2 Å, and beta-3,97
± 0.2 A. The co-crystals exhibit strong pseudo P4₁2₁2 symmetry. An initial solution to the
Patterson function was produced in the tetragonal pseudo space group both automatically
using SOLVE (T. C. Terwilliger and J. Berendzen, [...] (1997)) and manually using the
program RSPS (S. [...]), and initial phases calculated using SHARP (E. de La Fortelle and
G. Bricogne, in Methods in Enzymology, C. W. Carter, R. M. Sweet, Eds. (Academic Press,
New York, 1997), vol. 276, pp. 472)). Density modification including 4-fold
non-crystallographic symmetry (NCS) averaging was done using the program DM (K. D. Cowtan, Joint
CCP4 ESF-EACBM Newsl. Protein Crystallogr. 31: 34 (1994)). A model corresponding to
the two copies of the co-complex in the pseudo asymmetric unit was built using O (T. A.
Jones et al., Acta Cryst. A47, 110 (1991)), modeled in 4-fold averaged electron density and
refined against 2.5 Å native data applying tight non-crystallographic restraints. The crystals
are in either space group P4₁2₁2 or P4,, with cell dimensions a = b = 97.7 ± 0.2 angstroms and
c = 215.9 ± 0.2 angstroms. Bulk solvent correction, positional, simulated annealing and
isotropic temperature factor refinement has been carried out using X-PLOR (A. T. Brunger,
X-PLOR Manual, Version 3.1: A System for X-ray Crystallography and NMR (Yale
University Press, New Haven, CT, 1993)) and REFMAC (G. N. Murshudov, A. A. Vagin, E.
J. Dodson, Acta Cryst. D53, 240 (1997)), with tight NCS restraints against a 2.5 Å native data
set collected at Max II/BL711 in Lund. The current R-factor and R-free (on JS of the data)
are 24.0% and 26.8%, respectively. The r.m.s. deviations from ideal bond length and angle
values are 0.016 Å and 3.3°, respectively. No residues are found in disallowed regions of the
Ramachandran plot. The coordinates have been deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank (code 1QUN).

Example 4: FimC-FimH Co-complex structure
In the FimC-FimH co-complex, the seventh strand (G1 beta-strand) from the NH2-
terminal domain of the FimC chaperone is used to complement the pilin domain by being
inserted between the second half of the A strand and the F strand of the domain (Figure 10C).
Thus, the final strand (F) of FimH forms a parallel beta-strand interaction with the G1 strand
of FimC and has its COOH-terminal carboxyl group anchored in the crevice of the chaperone
cleft through hydrogen bonding with the conserved residues Arg8C and Lys112C in FimC
(Figure 9A).

The G1 beta-strand of the FimC chaperone contains a conserved motif of solvent
exposed hydrophobic residues at positions 103, 105, and 107. In the FimC-FimH
co-complex, these residues are used to complete the unfinished hydrophobic core of FimH
_ ^^ Ure ^P)' ^ e tVf ° res !^ ues . Ii e y. IWC a nd„Leu l05C are deeply buried in the crevice created in 
the FrmH prhn domain due to the missing seventh strand. Ile"» c is somewhat closer to the 
domain surface but makes van der Waals contacts with residues Val' 65H and Phe 276H Leu"» c 
contacts residues lie'- VaP", Leu 22 " and lie 2 ™. Leu'- is in contact with He'-' Leu'- " 
Leu^ Demand VaP- ^ 

to emphasize the fact that the pilin domain is incomplete and that the chaperone donates its
G1 beta-strand to complete the fold of the pilin.

Example 5: Subunit-subunit interactions in Type 1 Pili

residues in the two conserved motifs (the COOH-terminal F strand and an NH2-terminal
motif) participate in subunit-subunit interactions necessary for pilus assembly. See G. E. Soto
et al., EMBO J., 17:6155 (1998). An alignment of the pilin sequences, based on the FimC-
FimH co-crystal structure, revealed that the NH2-terminal motif was part of a 10-20 residue
NH2-terminal extension that was missing in the FimH pilin domain (Figure 8A). This region
contains a highly conserved pattern of alternating hydrophobic residues (highlighted in Figure
8A) similar to the donor G1 beta-strand of the chaperone. This motif is structurally analogous
to the G1 donor strand motif of the chaperone and molecular modeling indicates that it would
be able to fit into the same groove occupied by the donor G1 beta-strand of the chaperone.
The type 1 pilus [...]
approximately 70 Å, a central pore of about 20-25 Å, and a rise per subunit of about 8 Å. In
order to obtain this structure, insertion of the NH2-terminal extension must be antiparallel to
strand F, in contrast to the parallel insertion observed for the G1 beta-strand of the chaperone.
Insertion in a parallel orientation would lead to rosette-like structures. One edge of the pilin
groove is lined by the COOH-terminal F strand, which has been shown to form a critical part
of the subunit tail. Thus, the NH2-terminal extension represents the head of a subunit and,
during pilus biogenesis, it would displace the donor G1 beta-strand of the chaperone to fit into
the tail groove of a neighboring subunit and to complete the pilin fold of its neighbor in a
donor strand complementation mechanism.

Using the FimH pilin domain as a model for FimA, applicants constructed a model for
the type 1 pilus that fit these data (Figure 11). Each subunit was aligned to have its cleft
facing towards the center of the pilus so that the height from the top to the bottom of the
domain along the helix axis was approximately 25 Å. Applying a rotation of 115 degrees and
a rise per subunit of 8 Å, a hollow helical cylinder is created. The outer diameter of this
cylinder as measured across Cα atoms is 70 Å, and the inner diameter is 25 Å. FimA subunits
from different strains of E. coli exhibit considerable allelic variation. The vast majority of the
variable positions are on the outside surface of the pilus model proposed above (Figure 11),
which would account for the antigenic variability of type 1 pili.
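
The helical parameters given for the pilus model (a rotation of 115 degrees and a rise of 8 Å per subunit) can be turned into subunit positions with a few lines of arithmetic. The Python sketch below is only an illustration of that geometry: the function name is hypothetical, each subunit is reduced to a single reference point, and the 35 Å radius is an assumption taken as half of the 70 Å outer diameter rather than a value stated for subunit centers.

```python
import math

ROTATION_DEG = 115.0   # rotation per subunit, from the model description
RISE_ANGSTROM = 8.0    # rise per subunit along the helix axis

# Derived quantities: subunits needed for one full turn, and the axial
# distance covered by that turn (the helical pitch).
SUBUNITS_PER_TURN = 360.0 / ROTATION_DEG
PITCH = SUBUNITS_PER_TURN * RISE_ANGSTROM


def helix_positions(n_subunits, radius,
                    rotation_deg=ROTATION_DEG, rise=RISE_ANGSTROM):
    """Place one reference point per subunit on a helix about the z axis."""
    positions = []
    for i in range(n_subunits):
        angle = math.radians(i * rotation_deg)
        positions.append((radius * math.cos(angle),
                          radius * math.sin(angle),
                          i * rise))
    return positions


print(f"{SUBUNITS_PER_TURN:.2f} subunits per turn, pitch {PITCH:.1f} angstroms")
for x, y, z in helix_positions(4, radius=35.0):  # assumed radius
    print(f"{x:7.2f} {y:7.2f} {z:6.1f}")
```

With these parameters one turn takes a little over three subunits and spans roughly 25 Å along the axis, which is close to the approximately 25 Å subunit height used when orienting the domains.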

The proposed head-to-tail interaction between subunits in a pilus is reminiscent of
oligomerization through three-dimensional domain swapping in the sense that a part of the
molecule is used to complement another. However, in this case, complementation occurs not
only between identical protein chains (FimA in the pilus rod) but also between homologous
but distinct chains, e.g., FimG, FimF and FimH in the pilus tip. Furthermore, because
individual pilin protomers do not exist as stable monomers, there is no exchange of
structural units between a monomeric and an oligomeric state. Instead, a different protein,
the periplasmic chaperone, is needed to keep the monomeric subunits in solution by donating
a unique part of its structure (the G1 beta-strand) to the different subunit grooves.

Based on the structure of the FimC-FimH co-complex, pilins are missing the
necessary steric information needed to fold into a native three dimensional structure. The
information that is missing consists of the seventh edge strand of an immunoglobulin fold
[...] to the hydrophobic core of the pilin by
the periplasmic chaperone in a donor strand complementation mechanism. Thus, the steric
information necessary for newly synthesized protein chains to fold correctly is not inherent in
the sequence of the protein to be folded; however, such information is instead transferred
from another protein, the periplasmic chaperone.

Example 6: FimH Binding [...]

The ability of FimH to bind to peptides corresponding to the G1 beta-strand of FimC
and the N-terminal extension of FimG was tested using an ELISA assay. During pilus
assembly, the G1 beta-strand of FimC completes the Ig fold of the FimH pilin domain in the
periplasm and then in the pilus the N-terminal extension of FimG completes the Ig fold of the
FimH pilin domain.

In order to assess the ability of FimH [...]

from the FimC-FimH co-complex. Synthetic peptides were synthesized corresponding to the

sequences are as follows: FimC peptide, NTI.QLAIISR (SEQ ID NO: 55) and FimG peptide,
DVTITVNGK (SEQ ID NO: 56). Stock solutions of the peptides (5 mg/ml) were dissolved
in DMSO.

The peptides were diluted in phosphate buffered saline (PBS) (120 mM NaCl, 2.7 mM
KCl, 10 mM PBS, pH 7.4) to 2 nmol/50 µl. FimC protein was diluted to 0.1
nmol/50 µl and coated overnight onto microtiter wells with 50 µl/well at 4°C. The ELISA
assay was carried out as described in Kuehn et al., 1993 and Hung et al., 1996. Briefly, the
wells were washed three times with PBS and blocked with 3% Bovine Serum Albumin
(BSA) in PBS for two hours at 25°C. Then the wells were washed three times with PBS. The
FimC-FimH co-complex was incubated in 3 M urea to separate the two proteins. Pure FimH
in 3 M urea was collected from the flow through of a Source 15S column (Pharmacia). See
Barnhart et al., PNAS USA 97: 7709-7714 (2000). The wells were incubated with 50 µl of
FimH in 2% BSA-PBS diluted to 5-25 pmol/well FimH for 45 minutes at 25°C. The wells
were washed 3 times with PBS followed by incubation with a 1:1000 dilution of mouse
anti-FimH antibodies in 3% BSA-PBS for 45 minutes at 25°C. The wells were washed 3 times
with PBS followed by incubation with a 1:1000 dilution of goat antiserum to mouse IgG
(Sigma) conjugated to alkaline phosphatase diluted in 3% BSA-PBS for 45 minutes at 25°C.
The wells were washed 3 times with PBS and washed 3 times with developing buffer (.0 mM
diethanolamine, [...] mM MgCl2). The ELISA was developed by adding 50 µl of substrate
(50 µl of filtered [...] p-nitrophenyl phosphate, Sigma) in developing buffer. The reaction
was incubated for 1 hour at 25°C in the dark and the absorbance at 405 nm was read.
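
The protocol states amounts per well or per 50 µl rather than molar concentrations. As a small worked conversion, reading the peptide dilution as 2 nmol per 50 µl, the FimC coating as 0.1 nmol per 50 µl and the FimH input as 5 to 25 pmol in a 50 µl well volume, the sketch below converts each to micromolar. The function name is hypothetical and the 50 µl volume for FimH is an assumption based on the stated incubation volume.

```python
def micromolar(amount_nmol, volume_ul):
    """Convert an amount in nmol and a volume in microliters to µM.

    1 nmol per µl equals 1 mM, i.e. 1000 µM.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_nmol * 1000.0 / volume_ul


WELL_VOLUME_UL = 50.0

peptide_um = micromolar(2.0, WELL_VOLUME_UL)      # 2 nmol peptide
fimc_um = micromolar(0.1, WELL_VOLUME_UL)         # 0.1 nmol FimC
fimh_low_um = micromolar(0.005, WELL_VOLUME_UL)   # 5 pmol FimH
fimh_high_um = micromolar(0.025, WELL_VOLUME_UL)  # 25 pmol FimH

print(f"peptide ~{peptide_um:.0f} µM, FimC ~{fimc_um:.0f} µM, "
      f"FimH ~{fimh_low_um:.1f}-{fimh_high_um:.1f} µM")
```

On this reading the peptide is present at roughly 40 µM and FimH at roughly 0.1 to 0.5 µM, so the peptide is in large molar excess over FimH.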

The competition assays were carried out similarly. FimC was coated onto microtiter
wells at 0.1 nmol/well. FimH at 5 pmol/well in 3% BSA-PBS was added to the FimC coated
wells in the presence or the absence of the FimC or FimG peptide at 2 nmol/well or the
indicated peptide concentration. Further, increasing concentrations of FimH were incubated
with constant concentrations of the FimC or FimG peptides or the FimC protein immobilized
on microtiter wells. FimH bound well to both pure FimC protein immobilized on microtiter
wells (Fig. 12) and to the peptides corresponding to the G1 beta-strand of FimC and the
N-terminal extension of FimG (Figure 12). Next, the ability of the peptides to inhibit FimH
binding to FimC was tested. FimH was added to the FimC coated wells in the presence or
absence of peptides to FimC or FimG. Increasing concentrations of the FimC peptide further
decreased the ability of FimH to bind to FimC immobilized on microtiter wells (Fig. 13).
The FimC peptide inhibited the ability of FimH to bind to FimC immobilized on the
microtiter wells (Fig. 14); however, the FimG peptide at the tested concentration did not
inhibit the ability of FimH to bind to FimC (Fig. 14).

Other features, objects and advantages of the present invention will be apparent to
those skilled in the art. The explanations and illustrations presented herein are intended to
acquaint others skilled in the art with the invention, its principles, and its practical
application. Those skilled in the art may adapt and apply the invention in its numerous
forms as may be best suited to the requirements of a particular use. Accordingly, the specific
embodiments of the present invention as set forth are not intended as being exhaustive or
limiting of the present invention.

We claim:

1. An isolated compound which binds to a pilus subunit groove thereby inhibiting pilus
assembly.

2. The compound of claim 1 wherein the compound is a peptide.

3. The compound of claim 1 wherein the compound is a non-peptide compound.

4. The compound of claim 1 further comprising a mimic of an amino-terminal motif of a
pilus subunit with at least two alternating hydrophobic amino acid residues which
mimic exhibits antibacterial activity against a Gram-negative bacterium.

5. The compound of claim 1 further comprising a mimic of a chaperone G1 beta-strand
with at least two alternating hydrophobic amino acid residues which exhibits
antibacterial activity against a Gram-negative bacterium.

6. The compound of any one of claims 1-5 wherein the compound has been modified to
improve binding, specificity, solubility, safety or efficacy.



7. The compound of claim 1 which is a 10 to 20 residue peptide or peptide analog
according to formula (I):

(I) Z1~Z2-X1-X2-X3-X4-X5-X6-X7-X8-X9-X10-Z3~Z4

or a pharmaceutically-acceptable salt thereof, wherein:
Z1 is R-C(O)-NR- or RRN-;
Z2 is an optional 1 to 5 residue peptide or peptide analog;
X1 is any amino acid residue;
X2 is any amino acid residue;
X3 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue;
X4 is any amino acid residue;
X5 is a hydrophobic residue or Gly;
X6 is a hydrophobic or a hydrophilic residue;
X7 is Gly, an amide-substituted polar residue or a hydrophobic residue;
X8 is any amino acid residue;
X9 is an aliphatic residue;
X10 is any amino acid residue;
Z3 is an optional 1 to 5 residue peptide or peptide analog;
Z4 is -C(O)OR or -C(O)NRR;
each R is independently hydrogen, (C1-C6) alkyl, (C2-C6) alkenyl, (C2-C6)
alkynyl or (C6-C14) aryl;
each "-" between residues X1 through X10, Z2 and X1, and X10 and Z3
independently represents an amide linkage, a substituted amide linkage or an isostere
of an amide linkage; and
each "~" represents a bond.

8. The compound of claim 7 wherein said compound further comprises one or more
features selected from the group consisting of:

each "-" between residues X1 through X10, Z2 and X1, and X10 and Z3 is an
amide linkage;
Z1 is H2N-;
Z4 is -C(O)OH or a salt thereof;
optional Z2 is not present;
optional Z3 is not present;
X1 is other than a basic residue;
X2 is other than an aliphatic residue;
X3 is an aliphatic residue or T;
X4 is other than an acidic residue;
X5 is an aliphatic residue, F or G;
X7 is G, N or A;
X8 is other than an aliphatic residue; and
X10 is an aliphatic or a polar residue.

9. The compound of claim 8 which is selected from the group consisting of SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7,
SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12,
SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17,
SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22,
SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27,
SEQ ID NO: 28 and SEQ ID NO: 29.

10. The compound of claim 1 which is a 7 to 17 residue peptide or peptide analog
according to formula (II):

(II) Z11~Z12-X11-X12-X13-X14-X15-X16-X17-Z13~Z14

or a pharmaceutically-acceptable salt thereof, wherein:
Z11 is R'-C(O)-NR'- or R'R'N-;
Z12 is an optional 1 to 5 residue peptide or peptide analog;
X11 is any amino acid residue;
X12 is any amino acid residue;
X13 is a hydrophobic residue;
X14 is any amino acid residue;
X15 is a hydrophobic residue;
X16 is any amino acid residue;
X17 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue;
Z13 is an optional 1 to 5 residue peptide or peptide analog;
Z14 is -C(O)OR' or -C(O)NR'R';
each R' is independently hydrogen, (C1-C6) alkyl, (C2-C6) alkenyl, (C2-C6)
alkynyl or (C4-C14) aryl;
each "-" between residues X11 through X17, Z12 and X11, and X17 and Z13
independently represents an amide linkage, a substituted amide linkage or an isostere
of an amide linkage; and
each "~" independently represents a bond.

11. The compound of claim 10 wherein said compound further comprises one or more
features selected from the group consisting of:

each "-" between residues X11 through X17, Z12 and X11, and X17 and Z13 is an
amide linkage;
Z11 is H2N-;
Z14 is -C(O)OH or a salt thereof;
optional Z12 is not present;
optional Z13 is not present;
X,, is other than a basic residue;
X13 is an aliphatic residue or M;
X14 is other than an aromatic residue;
X15 is an aliphatic residue, F or M; and
X17 is an aliphatic residue, E, M or a hydroxyl-substituted aliphatic residue.

' The compound of claim 11 which is selected from the group consisting of SEQ ID 
NO: 1, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID 
NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID 
NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID 
NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID 
NO: 49, SEQ ID NO: 50, SEQ ID NO: 51 and SEQ ID NO: 52. 

13. The compound of any one of claims 1-12 wherein said compound exhibits
antibacterial activity against a Gram-negative bacterium comprising Escherichia coli,
Haemophilus influenzae, Salmonella enteritidis, Salmonella typhimurium, Bordetella
pertussis, Yersinia pestis, Yersinia enterocolitica, Helicobacter pylori and Klebsiella
pneumoniae.

14. A mannose analogue capable of competitively binding the amino terminal
mannose-binding domain of a Gram-negative bacterial adhesin.

15. The analogue of claim 14 wherein said compound exhibits antibacterial activity
against a Gram-negative bacterium ... pneumoniae.



16. A composition comprising a compound according to any one of claims 1-15 and a
pharmaceutically acceptable carrier, excipient or diluent.



17. A method of ... in a subject, said method comprising administering an effective
amount of a compound according to any one of claims 1-13.



18. A method of preventing or ... formation of a ... according to any one of
claims 1-13.
19. A method of treating a bacterial infection comprising administering to a subject ...

20. The method of claim 19 wherein the bacterial infection is ... Escherichia coli, ...
Yersinia pestis, Yersinia enterocolitica, Helicobacter pylori and Klebsiella pneumoniae.

21. The method of any one of claims 17-20 wherein the subject is a mammal or human.

22. The method of any one of claims 17-20 wherein the subject is a plant.

23. A method of preventing or inhibiting pili adhesion to a host tissue, said method
comprising administering a mannose analogue of claim 14 or 15.



24. A method of preventing or inhibiting biofilm formation, said method comprising 
administering an effective amount of a compound of any one of claims 1-15 to an 
environment or surface containing Gram-negative bacteria. 

25. A method for inhibiting bacterial colonization by a Gram-negative organism, said
method comprising administering an effective amount of a compound of any one of 
claims 1-15 to an environment or surface containing Gram-negative bacteria. 

26. A composition comprising a pilus chaperone-subunit co-complex in crystalline form,
wherein said co-complex comprises an amino acid sequence of a G1 beta-strand of a
chaperone and an amino acid sequence of an amino-terminal end of a pilus subunit.

27. The composition of claim 26 wherein said amino acid sequence of the G1 beta-strand
of the chaperone is derived from a N101 to L107 amino acid region of the G1
beta-strand of a chaperone.

28. The composition of claim 27 wherein the amino acid sequence derived from a G1
beta-strand of a chaperone is SEQ ID NO: 1.

29. The composition of any one of claims 26-28 wherein the amino acid sequence derived
from an amino acid sequence of an amino-terminal end of a pilus subunit is SEQ ID
NO: 12.

30. The composition of claim 26 wherein the pilus chaperone-subunit co-complex in
crystalline form is a PapD-PapK chaperone-subunit co-complex.

31. The composition of claim 30 wherein the crystal has a space group of P2₁2₁2₁ with
unit cell dimensions of a = 62.1 ± 0.2 angstroms, b = 63.6 ± 0.2 angstroms and c =
92.7 ± 0.2 angstroms.

32. The composition of claim 31, wherein said crystal is of diffraction quality. 

33. The composition of claim 31, wherein said crystal is a native crystal.

34. The composition of claim 31, wherein said crystal is a heavy-atom derivative crystal.

35. The composition of claim 31, wherein at least one of PapD or PapK of the PapD-PapK
chaperone-subunit co-complex is a mutant.

36. The crystal of claim 35, wherein the mutant is a selenomethionine or selenocysteine
mutant.

37. The crystal of claim ...

38. The crystal of claim 35, wherein the mutant is a truncated or extended mutant.

39. The composition of claim 31, wherein said crystal is produced by a method
comprising the steps of:

(a) mixing a volume of a solution comprising the PapD-PapK chaperone- 
subunit co-complex with a volume of a reservoir solution comprising a 
precipitant; and 

(b) incubating the mixture obtained in step (a) over the reservoir solution 
in a closed container, under conditions suitable for crystallization until 
the crystal forms. 



40. A method of producing a PapD-PapK chaperone-subunit co-complex in crystalline
form, said method comprising:

(a) mixing a volume of a solution comprising the PapD-PapK chaperone-subunit
co-complex with a volume of a reservoir solution comprising a
precipitant; and

(b) incubating the mixture obtained in step (a) over the reservoir solution
in a closed container, under conditions suitable for crystallization until
the crystal forms.

41. A method of identifying an antibacterial compound, comprising the step of using a
three-dimensional structural representation of a pilus chaperone-subunit co-complex
or a fragment thereof comprising a G1 beta-strand binding cleft, to computationally
screen a candidate compound for an ability to bind the G1 beta-strand binding cleft of
the pilus subunit.

42. The method of claim 41 further comprising the steps of:
synthesizing the candidate compound; and
screening the candidate compound for antibacterial activity.

43. The method of claim 42 wherein the three dimensional structural information
comprises the atomic structure coordinates of a PapK subunit.



44. The method of claim 43 wherein the three dimensional structural information further
comprises the atomic structure coordinates of residues comprising the G1 beta strand
binding cleft of a PapK subunit.

45. The method of claim 43 or 44 wherein the atomic structure coordinates are obtained
from the atomic structure coordinates of a PapD-PapK chaperone-subunit co-complex.

46. The method of claim 45 wherein the PapD-PapK co-complex atomic structure
coordinates are those coordinates deposited at the Protein Data Bank under entry code 



47. The method of claim 42 wherein the structural information comprises the atomic
structure coordinates of a FimH subunit.

48. The method of claim 47 wherein the structural information further comprises the
atomic structure coordinates of residues comprising a G1 beta-strand binding cleft of a
FimH subunit.

49. The method of claim 47 or 48 wherein the atomic structure coordinates are obtained 
from the atomic structure coordinates of a FimC-FimH chaperone-adhesin co- 
complex. 

50. The method of claim 49 wherein the atomic structure coordinates are those
coordinates deposited at the Research Collaboratory for Structural Bioinformatics
Protein Data Bank under entry code 1QUN.



51. A method of identifying an antibacterial compound comprising the step of using a
three-dimensional structural representation of a pilus chaperone-subunit co-complex
or a fragment thereof comprising a G1 beta-strand binding cleft, to computationally
design a synthesizable candidate compound that binds the G1 beta-strand binding cleft
of a pilus subunit.

52. The method of claim 51 wherein the computational design comprises the steps of:
identifying chemical entities or fragments capable of associating with the G1
beta strand binding cleft of the chaperone subunit; and
assembling the chemical entities or fragments into a single molecule to
provide the structure of the candidate compound.

53. The method of claim 52 further comprising the steps of:
synthesizing the candidate compound; and
screening the candidate compound for antibacterial activity.

54. The method of claim 53 wherein the structural information comprises the atomic
structure coordinates of a PapK subunit.



55. The method of claim 54 wherein the structural information further comprises the
atomic structure coordinates of residues comprising the G1 beta-strand binding cleft of
a PapK subunit.

56. The method of claim 54 or 55 wherein the atomic structure coordinates are obtained 
from the atomic structure coordinates of a PapD-PapK chaperone-subunit co-complex. 



57. The method of claim 56 wherein the atomic structure coordinates of the PapD-PapK
co-complex are those coordinates deposited at the Protein Data Bank under entry code
1PDK.

58. The method of claim 53 wherein the structural information comprises the atomic
structure coordinates of a FimH subunit.

59. The method of claim 58 wherein the structural information comprises the atomic
structure coordinates of residues comprising a G1 beta-strand binding cleft of a FimH
subunit.

60. The method of claim 58 or 59 wherein the atomic structure coordinates are obtained
from the atomic structure coordinates of a FimC-FimH chaperone-adhesin co-complex.

61. The method of claim 60 wherein the atomic structure coordinates of the FimC-FimH
chaperone-adhesin are those coordinates deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank under entry code 1QUN.



62. A method of identifying a compound having antibacterial activity, comprising the step
of using a three-dimensional structural representation of a chaperone, or a fragment
thereof comprising a G1 beta-strand, to identify or design a compound having a
three-dimensional structure similar to the three-dimensional structure of the G1
beta-strand of the chaperone.

63. The method of claim 62 wherein the three-dimensional structural information
comprises the atomic structure coordinates of residues comprising a G1 beta-strand of
a PapD chaperone subunit or a FimC chaperone.

64. The method of claim 63 wherein the three-dimensional structural information
comprises the atomic structure coordinates of a PapD chaperone.

65. The method of claim 63 or 64 wherein the atomic structure coordinates of the PapD
chaperone are obtained from the atomic structure coordinates of a PapD-PapK
chaperone-subunit co-complex.

66. The method of claim 65 wherein the atomic structure coordinates of the PapD-PapK
chaperone-subunit co-complex are those deposited at the Protein Data Bank under
entry code 1PDK.

67. The method of claim 63 wherein the three-dimensional structural information
comprises the atomic structure coordinates of a FimC chaperone.

68. The method of claim ... wherein the atomic structure coordinates of the FimC
chaperone are obtained from the atomic structure coordinates of a FimC-FimH
chaperone-adhesin co-complex.

69. The method of claim 68 wherein the structure coordinates of the FimC-FimH
chaperone-adhesin co-complex are those deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank under entry code 1QUN.

70. A method of identifying an antibacterial compound, said method comprising the step
of using a three-dimensional structural representation of an adhesin, or a fragment
thereof comprising a lectin binding domain or portion thereof, to screen a candidate
compound for the ability to bind a lectin binding domain of the adhesin.

71. The method of claim 70, further comprising the steps of:
synthesizing the candidate compound; and
assaying the candidate compound for antibacterial activity.

72. The method of claim 71 wherein the three-dimensional structural information 
comprises the atomic structure coordinates of a FimH adhesin. 



73. The method of claim 72 wherein the three-dimensional structural information further
comprises the atomic structure coordinates of residues comprising a lectin binding
domain of a FimH adhesin or portion thereof.

74. The method of claim 72 or 73 wherein the atomic structure coordinates are obtained
from the structure coordinates of a FimC-FimH chaperone-adhesin co-complex.

75. The method of claim 74 wherein the structure coordinates of the FimC-FimH
chaperone-adhesin co-complex are those deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank under entry code 1QUN.

76. A method of identifying an antibacterial compound comprising the step of using a
three-dimensional structural representation of an adhesin, or a fragment thereof
comprising a lectin binding domain or portion thereof, to computationally design a
compound that binds the lectin binding domain of the adhesin.

77. The method of claim 76 wherein the computational design comprises the steps of:
identifying chemical entities or fragments capable of associating with the
lectin binding domain; and
assembling the chemical entities or fragments into a single molecule to
provide the structure of the candidate compound.

78. The method of claim 77, further comprising the steps of:
synthesizing the candidate compound; and 
screening the candidate compound for antibacterial activity. 

79. The method of claim 78 wherein the three-dimensional structural information
comprises the atomic structure coordinates of a FimH adhesin. 

80. The method of claim 79 wherein the three-dimensional structural information further 
comprises the atomic structure coordinates of residues comprising a lectin binding 
domain of a FimH adhesin. 

81. The method of claim 79 or 80 wherein the atomic structure coordinates are obtained
from the structure coordinates of a FimC-FimH chaperone-adhesin co-complex or
portion thereof.

82. The method of claim 81 wherein the structure coordinates of the FimC-FimH
chaperone-adhesin co-complex are those deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank under entry code 1QUN.

83. A machine-readable medium ... three-dimensional structural representation of a
crystalline pilus chaperone-subunit co-complex or a fragment or portion thereof.

84. The machine-readable medium of claim 83 wherein the pilus chaperone-subunit
co-complex is a PapD-PapK chaperone-subunit co-complex.

85. The machine-readable medium of claim 84 wherein at least one subunit of the
PapD-PapK co-complex is a mutant.

86. The machine-readable medium of claim 85 wherein the mutant is a selenomethionine
or selenocysteine mutant.



87. The machine-readable medium of claim 85 wherein the mutant is a conservative 
mutant. 



88. The machine-readable medium of claim 84, in which the information comprises 
atomic structure coordinates, or a subset thereof.

89. The machine-readable medium of claim 88 wherein the atomic structure coordinates
are those deposited at the Protein Data Bank under entry code 1PDK, or a subset
thereof.

90. The machine-readable medium of claim 83 wherein the pilus chaperone-subunit co- 
complex is a FimC-FimH chaperone-adhesin co-complex. 

91. The machine-readable medium of claim 90 wherein at least one subunit of the
FimC-FimH chaperone-adhesin co-complex is a mutant.

92. The machine-readable medium of claim 91 wherein the mutant is a selenomethionine 
or selenocysteine mutant. 

93. The machine-readable medium of claim 91 wherein the mutant is a conservative 
mutant. 

94. The machine-readable medium of claim 90, in which the information comprises 
atomic structure coordinates, or a subset thereof. 

95. The machine-readable medium of claim 94 wherein the atomic structure coordinates
are those deposited at the Research Collaboratory for Structural Bioinformatics
Protein Data Bank under entry code 1QUN, or a subset thereof.

Fig. 3A
Fig. 3C
FIG. 7C
FIG. 11A

SEQUENCE LISTING
<110> WASHINGTON UNIVERSITY 

<120> ANTI-BACTERIAL COMPOUNDS DIRECTED AGAINST PILUS 
BIOGENESIS, ADHESION AND ACTIVITY; CO-CRYSTALS OF PILUS 
SUBUNITS AND METHODS OF USE THEREOF

<130> WSHU2005.2

<140>
<141>

<150> US 60/148,280
<151> 1999-08-11

<160> 56

<170> PatentIn Ver. 2.1

<210> 1
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 1
Asn Val Leu Gln Ile Ala Leu
  1               5

<210> 2
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 2
Gly Lys Val Thr Phe Asn Gly Thr Val Val
  1               5                  10

<210> 3
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 3
Gly Thr Val His Phe Lys Gly Glu Val Val
  1               5                  10

<210> 4
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 4
Gly Lys Val Thr Phe Phe Gly Lys Val Val
  1               5                  10

<210> 5
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 5
Gly Thr Ile Val Ile Thr Gly Thr Ile Thr
  1               5                  10

<210> 6
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 6
Gly Thr Ile Val Ile Thr Gly Ser Ile Ser
  1               5                  10

<210> 7
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 7
Gly Thr Val Lys Phe Val Gly Ser Ile Ile
  1               5                  10

<210> 8
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 8
Gly Glu Ile Gln Leu Lys Gly Glu Ile Val
  1               5                  10

<210> 9
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 9
Gly Thr Ile Lys Phe Thr Gly Glu Ile Val
  1               5                  10

<210> 10
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 10
Asn Glu Val Thr Phe Leu Gly Ser Val Ser
  1               5                  10

<210> 11
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 11
Gly Thr Ile Asn Phe Glu Gly Ser Val Val
  1               5                  10

<210> 12
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 12
Ser Asp Val Ala Phe Arg Gly Asn Leu Leu
  1               5                  10

<210> 13
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 13
Gly Arg Ala Ala Phe His Gly Glu Val Val
  1               5                  10

<210> 14
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 14
Gly Arg Ala Thr Phe His Gly Glu Val Val
  1               5                  10

<210> 15
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 15
Asp Asn Leu Thr Phe Arg Gly Lys Leu Ile
  1               5                  10

<210> 16
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 16
Asp Asn Leu Thr Phe Lys Gly Lys Leu Ile
  1               5                  10

<210> 17
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 17
Gly Trp Leu Asn Leu Gln Gly Thr Ile Leu
  1               5                  10

<210> 18
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 18
Ser Val Val Asn Ile Thr Gly Asn Val Gln
  1               5                  10

<210> 19
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 19
Thr Thr Ile Thr Val Thr Gly Asn Val Leu
  1               5                  10

<210> 20
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 20
Thr Thr Ile Thr Val Thr Gly Arg Val Leu
  1               5                  10

<210> 21
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 21
Cys Met Leu Ala Gly Ser Asn Phe Val Thr
  1               5                  10

<210> 22
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 22
Val Gln Ile Asn Ile Arg Gly Asn Val Tyr
  1               5                  10

<210> 23
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 23
Pro Asn Leu Lys Leu Phe Gly Thr Leu Leu
  1               5                  10

<210> 24
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 24
Val Tyr Ile Asn Ile Thr Gly Asn Val Ile
  1               5                  10

<210> 25
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 25
Gly Lys Ile Thr Phe Asn Gly Lys Val Val
  1               5                  10

<210> 26
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 26
Gly Thr Ile Asn Phe Asn Gly Lys Ile Thr
  1               5                  10

<210> 27
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 27
Gln Lys Thr Ile Phe Ser Ala Asp Val Val
  1               5                  10

<210> 28
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 28
Gly Gln Val Asn Phe Phe Gly Lys Val Thr
  1               5                  10

<210> 29
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 29
Gln Arg Thr Ile Ile Thr Ala Asp Val Val
  1               5                  10

<210> 30
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 30
Gly Ser Leu Ser Leu Ala Ile
  1               5

<210> 31
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 31
Asn Tyr Leu Gln Phe Ala Ile
  1               5

<210> 32
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 32
Ser Gly Ile Ala Val Ala Leu
  1               5

<210> 33
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 33
Asn Ile Leu Gln Leu Ala Ile
  1               5

<210> 34
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 34
Ser Phe Met Gln Ile Ala Ile
  1               5

<210> 35
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 35
Asn Tyr Leu Gln Phe Ala Val
  1               5

<210> 36
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 36
Asn Thr Leu Gln Leu Ala Ile
  1               5

<210> 37
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 37
Gly Val Leu Gln Leu Thr Ile
  1               5

<210> 38
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 38
Asn Val Leu Ala Val Ala Val
  1               5

<210> 39
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 39
Ser Leu Leu Gln Leu Ala Phe
  1               5

<210> 40
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 40
Ser Gly Ile Ala Val Ala Val
  1               5

<210> 41
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 41
Asn Ala Leu Lys Phe Ala Met
  1               5

<210> 42
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 42
Asn Val Leu Gln Met Ala Met
  1               5

<210> 43
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 43
Asn Tyr Leu Gln Phe Ala Ile
  1               5

<210> 44
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 44
Asn Val Leu Gln Ile Ala Val
  1               5

<210> 45
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 45
Leu Asn Val Asn Val Val Thr
  1               5

<210> 46
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 46
Val Phe Val Gln Phe Ala Ile
  1               5

<210> 47
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 47
Met Lys Leu Asn Val Ser Ile
  1               5

<210> 48
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 48
Met Asp Ile Gln Met Ser Ile
  1               5

<210> 49
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 49
Leu Asn Ile Leu Leu Ser Val
  1               5

<210> 50
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 50
Met Asn Ile Gln Val Ser Val
  1               5

<210> 51
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 51
Asp Ser Ile Asn Ile Ser Ile
  1               5

<210> 52
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence
<400> 52
Leu Asn Val Gln Leu Ser Val
  1               5

<210> 53
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer
<400> 53
catcgctggc acaggaagga gc    22

<210> 54
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer
<400> 54
gttggtatga cccgcatcaa tcgc    24



<210> 55
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
proteins
<400> 55
Asn Thr Leu Gln Leu Ala Ile Ile Ser Arg
  1               5                  10

<210> 56
<211> 9
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
protein
<400> 56
Asp Val Thr Ile Thr Val Asn Gly Lys
  1               5
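
A listing in this tagged layout can be checked mechanically for transcription damage. The Python sketch below is illustrative only and is not part of the disclosure. It assumes the numeric identifiers carry their usual meaning in this listing (<210> sequence identifier, <211> declared length, <212> molecule type, <400> sequence data). The names parse_entries, length_mismatches and SAMPLE are hypothetical, and SAMPLE reproduces only SEQ ID NO: 1 and SEQ ID NO: 53 from above.

```python
import re

# The 20 standard three-letter amino acid codes expected on <400> residue lines.
AMINO_ACIDS = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}

# A tagged line looks like "<211> 7": a three-digit identifier, then a value.
TAG = re.compile(r"^<(\d{3})>\s*(.*)$")


def parse_entries(text):
    """Collect one dict per <210> entry (hypothetical helper)."""
    entries, current, field = [], None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TAG.match(line)
        if match:
            field, value = match.group(1), match.group(2).strip()
            if field == "210":
                current = {"id": int(value), "length": None,
                           "type": None, "residues": []}
                entries.append(current)
            elif current is not None and field == "211":
                current["length"] = int(value)
            elif current is not None and field == "212":
                current["type"] = value
        elif current is not None and field == "400":
            tokens = line.split()
            if current["type"] == "PRT":
                # Keep valid residue codes only; position numbers are skipped.
                current["residues"] += [t for t in tokens if t in AMINO_ACIDS]
            else:
                # Nucleotide lines: keep the letters, skip the running count.
                current["residues"] += [c for t in tokens if t.isalpha() for c in t]
    return entries


def length_mismatches(entries):
    """Return (id, declared, counted) for entries whose counts disagree."""
    return [
        (e["id"], e["length"], len(e["residues"]))
        for e in entries
        if e["length"] != len(e["residues"])
    ]


SAMPLE = """
<210> 1
<211> 7
<212> PRT
<400> 1
Asn Val Leu Gln Ile Ala Leu
  1               5

<210> 53
<211> 22
<212> DNA
<400> 53
catcgctggc acaggaagga gc    22
"""

print(length_mismatches(parse_entries(SAMPLE)))
```

Because unrecognized tokens are dropped rather than guessed at, a misread residue code such as "Gin" or "He" shows up as a shortfall against the declared <211> length instead of passing silently.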



